Journal of Speech and Hearing Research


September 1960   VOLUME 3, NUMBER 3


Consonantal Nasal Pressure in Cleft Palate Speakers 
DONALD A. HESS AND EUGENE T. MCDONALD 


Phoneme Perception in Lipreading 
MARY F. WOODWARD AND CARROLL G. BARBER 


Sugar Placebos and Stuttering 
JAMES R. PALASEK AND W. SCOTT CURTIS 


Cinefluorographic Techniques in Speech Research 
KENNETH L. MOLL 


Color-Form Attitudes of Deaf Children 
DONALD G. DOEHRING 


Factors Affecting Thresholds for Short Tones
ROBERT GOLDSTEIN AND JOAN C. KRAMER 


Agrammatism and Inflectional Morphology in English 
HAROLD GOODGLASS AND JEAN BERKO 


Nasal Syllabics in American English 
ANDRE MALECOT 


Bekesy Audiometry in Analysis of Auditory Disorders 
JAMES JERGER 


Electrophysiologic Responsiveness and Alpha Rhythm in Children 
SIDNEY SCHOENFELD AND ROBERT GOLDSTEIN 


Sequence of Action of Breathing Muscles during Speech 
MICHAEL S. HOSHIKO 


Book Reviews 


Research News Notes 








The American 


Speech and Hearing 
Association 


OFFICERS 


President 
Stanley Ainsworth, Ph.D. 
University of Georgia 


Executive Vice-President 
Jack Matthews, Ph.D. 
University of Pittsburgh 


Vice-President 
Jack L. Bangs, Ph.D. 
Houston Speech and Hearing Center 


Editor of the Association 
Wendell Johnson, Ph.D. 
University of Iowa 


OFFICERS-ELECT 


President-Elect 
G. Paul Moore, Ph.D. 
Northwestern University 


Vice-President-Elect 
Duane C. Spriestersbach, Ph.D. 
University of Iowa 


COUNCIL 

The Officers and the following 
Councilors: 

George A. Kopp, Ph.D. (1960) 
Oliver Bloodstein, Ph.D. (1960-62) 
William G. Hardy, Ph.D. (1960-62) 
Ira J. Hirsh, Ph.D. (1958-60) 

Ruth B. Irwin, Ph.D. (1960-63) 
James F. Jerger, Ph.D. (1959-61) 
Hayes A. Newby, Ph.D. (1960-63) 
Wilbert L. Pronovost, Ph.D. (1958-60) 
Dean E. Williams, Ph.D. (1959-61) 


EXECUTIVE SECRETARY 
Kenneth O. Johnson, Ph.D. 


APPLICATIONS FOR MEMBERSHIP SHOULD BE ADDRESSED TO THE EXECUTIVE SECRETARY 





The Journal 
of Speech and Hearing 
Research 


EDITOR 
Dorothy Sherman, Ph.D. 


ASSISTANT TO THE EDITOR 
Dorothy W. Moeller 


STATISTICAL CONSULTANT 
Leonard S. Feldt, Ph.D. 


ASSOCIATE EDITORS 

Oliver Bloodstein, Ph.D. 
Arthur S. House, Ph.D. 
James Jerger, Ph.D. 

D. E. Morley, Ph.D. 
Hildred Schuell, Ph.D. 
Arnold M. Small, Ph.D. 
William R. Tiffany, Ph.D. 
John C. Webster, Ph.D.
Joseph M. Wepman, Ph.D. 


ASSISTANT EDITORS 
Kenneth L. Moll, Ph.D. 
Martin A. Young, Ph.D. 


DEPARTMENT EDITORS 


Ernest H. Henrikson, Ph.D.
Book Reviews 


Martin F. Palmer, Sc.D. 
Records 


BUSINESS MANAGER 
Kenneth O. Johnson, Ph.D. 





Consonantal Nasal Pressure 


in Cleft Palate Speakers 


DONALD A. HESS


EUGENE T. McDONALD


There is general agreement (3, 5, 6, 8, 
10, 14, 15) that nasal emission is fairly 
common in cleft palate speech. Mc- 
Donald and Baker (6) have suggested 
that the term ‘nasal emission’ be re- 
strictively employed to describe the 
escape of air through the nasal passages 
when the speaker attempts to produce 
any sound requiring intraoral breath 
pressure, such as in producing plosives 
and fricatives. It is to be distinguished 
from nasality, which they characterize 
as a resonance phenomenon. Their de- 
scription of this clinical entity was cor- 
roborated by the findings of Counihan 
(3), which indicate that nasal emission 
among cleft palate speakers is perceived, 
if at all, in the plosive and fricative 
sounds, but not in nasals and semi- 
vowels. Counihan also reported that 
nasal emission was judged to be present 
considerably more often on voiceless 
consonants than on their voiced counterparts.
In this regard his results parallel
those of Black (1), who found that
in normal speech voiceless consonants
involve significantly greater intraoral
pressure than their voiced cognates
when means for these two groups of
sounds are compared. Black also found
significantly greater over-all mean
intraoral pressure for the fricative sounds
[f], [v], [s], and [z] than for the
plosive sounds [p], [b], [t], and [d].
His results might suggest that cleft
palate speakers, in their attempt to
build up adequate intraoral pressure,
characteristically have greater nasal air
escape on fricatives than on plosives.
Although Counihan (3) reported
detection of nasal emission for 17.41% of
the fricatives and 16.06% of the plosives
tested, he placed no significant
interpretation on the slight difference
in the frequency with which these two
types of consonants were judged to
involve perceptible nasal emission.


Donald A. Hess (D.Ed., Pennsylvania State
University, 1955) is Professor of Education
and Psychology, and Director, Speech and
Hearing Clinic, Indiana State College,
Indiana, Penn. Eugene T. McDonald (D.Ed.,
Pennsylvania State University, 1942) is
Professor of Speech and Speech Education, and
Director, Speech and Hearing Clinic,
Pennsylvania State University. This article is an
adaptation of a paper presented at the 1958
Convention of the American Speech and
Hearing Association, New York.


The present study is an attempt (a) 
to determine and evaluate the differ- 
ences in consonantal nasal pressure 
which may exist in the articulation of 
cleft palate speakers, and (b) to de- 
termine those consonants that represent 
the best diagnostic indicators of the 
cleft palate person’s over-all average 
nasal pressure. 




Procedure 


Equipment. The equipment used to 
test the subjects was a U-tube water 
manometer mounted on a vertical
wooden support. The over-all vertical 
height of the glass tubing was 15 in. Its 
bore and outside diameters were 3/16 
in. and .25 in., respectively. A scale, 
graduated in half centimeters, was 
placed behind the open glass tubing on 
one side, with its zero reference line at 
normal water level. The top of the glass 
tubing on the other side was connected 
to a glass Y-tube by a one-foot length 
of rubber tubing. From each wing of 
the Y-tube six-inch lengths of rubber 
tubing were connected to two-inch 
lengths of plastic tubing, the ends of 
which contained nasal olives which 
were individually molded from dental 
wax to ensure a complete seal in the 
nostrils. 


Subjects. The subjects were 20 per- 
sons with cleft palates, 15 males and five 
females. Their ages ranged from six to 
17 years, with a mean age of about 12 
years. Nine of the subjects had
postoperative palates; 11 were fitted with
prostheses. Half of the subjects were 
enrolled in the Residential Speech and 
Hearing Program at the Pennsylvania 
State University and were tested during 
the final two weeks of a six-weeks 
therapy period. The other 10 subjects 
were tested at the Speech and Hearing 
Clinic, Indiana State College, Indiana, 
Pennsylvania; five of these were en- 
rolled for speech therapy at the time 
they were tested, three had been dis- 
charged as satisfactory, and two were 
not receiving speech therapy although 
their speech was not judged to be satis- 
factory. 


Test Procedure. The subjects were 
instructed to articulate two types of 
speech samples, each in a separate test 
session: (a) a group of 24 monosyllables
of consonant-vowel (CV) type
in which each consonant except [ŋ]
was combined with the vowel [a], for
example, [pa], [ba], [ma], and (b) a
group of 24 trisyllables of CVCVCV
type which were constructed in the
same manner but articulated repetitively,
for example, [papapa], [bababa],
[mamama], at a rate of about one syllable
per half second, with each syllable
receiving equal stress. Nasal pressure
in both monosyllables and trisyllables 
was studied since it was felt that a sub- 
ject might be able to obtain an adequate 
velopharyngeal closure for the con- 
trolled production of one syllable but 
might not be able to do so for the 
repetitive production of the syllable. 

Each subject was required to produce 
each speech condition on three separate 
trials. Thus for either type of speech 
sample a total of 72 trials (three trials 
for each of 24 speech conditions) was 
given. The 72 trials were randomized 
to minimize the possible serial effects of 
practice and fatigue. After every five 
trials in this random order the seal of 
the nasal olives was checked and, where 
necessary, the five trials were repeated. 
Nasal pressure readings in centimeters
of water height from the zero reference
level¹ were obtained from each of the
72 trials for each subject. The highest
point (to the nearest half centimeter)
to which the water level rose in the
glass tubing was recorded as the nasal
pressure reading for each trial. Using
this method of recording nasal pressure
readings in the complete preliminary
test of one subject, the writers obtained
an average difference between their
readings of only .17 cm. On no trial
was the difference in their readings
greater than .5 cm. Therefore it was
assumed that nasal pressure readings
obtained as criterion measures by one
of the writers would be sufficiently
accurate.


¹To convert all reported nasal pressure
readings to actual pressure, the nasal pressure
readings are multiplied by two to account
for the difference in the water levels.
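The footnoted conversion from scale reading to actual pressure can be made concrete with a short sketch. It assumes, as the footnote states, that the recorded value is the rise on one arm of the U-tube, so that the pressure head is twice the reading. The function names and the pascal conversion factor (98.0665 Pa per centimeter of water, the conventional value) are illustrative additions and are not part of the original procedure.

```python
# Conventional pressure of a 1 cm column of water, in pascals.
# Assumption: this factor is supplied for illustration; the article
# reports everything in centimeters of water.
PA_PER_CM_WATER = 98.0665


def actual_pressure_cm_water(reading_cm: float) -> float:
    """Convert a one-arm scale reading to the full pressure head.

    The zero line sits at the resting water level, so a rise of
    `reading_cm` on the open arm is matched by an equal fall on the
    other arm; the level difference is therefore twice the reading.
    """
    return 2.0 * reading_cm


def actual_pressure_pa(reading_cm: float) -> float:
    """Same conversion, expressed in pascals."""
    return actual_pressure_cm_water(reading_cm) * PA_PER_CM_WATER


if __name__ == "__main__":
    reading = 2.5  # hypothetical reading, to the nearest half centimeter
    print(actual_pressure_cm_water(reading), "cm of water")
    print(round(actual_pressure_pa(reading), 1), "Pa")
```

Readings quoted elsewhere in the article (for example, the means in the tables) are scale readings, so the same doubling would apply before comparing them with pressures reported in other units.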


Phonemic Accuracy of Subjects. The 
subjects were trained to produce the 
consonants-in-syllable as correctly as 
possible before the readings were taken. 
A preliminary check of 10 subjects was 
made to discover how well they could 
articulate consonants-in-trisyllables fol- 
lowing auditory and visual stimulation. 
Three experienced clinicians were 
asked to judge whether each consonant- 
in-trisyllable produced by each of the 
subjects was correctly or incorrectly 
articulated. The judges were instructed 
to disregard audible nasal air escape as 
a possible influence on their judgments 
of phonemic accuracy. In other words, 
if a given subject exhibited correct 
phonetic placement and acoustic oral 
accuracy, the tested sound was con- 
sidered to be correctly articulated even 
though audible air escape might have 
accompanied its articulation. Each sub- 
ject was allowed as many as three trials 
following careful stimulation before 
the final judgment of each consonant- 
in-trisyllable was made. From the median
judgments it was determined that
all 10 subjects could correctly produce
[p], [b], [m], [w], [f], [t], [d], [n],
[s], [j], [k], and [h]; nine could correctly
produce [v], [θ], [ð], [z], [tʃ],
[l], [r], and [g]; eight could correctly
produce [ʃ]; seven could correctly
produce [ʒ] and [dʒ]; and only five
could correctly produce [ʍ]. In a similar
check of the other 10 subjects by
one of the writers, it was judged that
all 10 subjects could correctly produce
[p], [b], [m], [w], [f], [v], [θ], [ð],
[n], [ʃ], [tʃ], [dʒ], [j], [l], [k], [g],
and [h]; nine could correctly produce
[ʍ], [t], [s], [z], [ʒ], and [r]; and
eight could correctly produce [d]. Assuming
that both judgment procedures
were valid, it would appear that con- 
sonantal misarticulation did not materi- 
ally affect the results of the study. Only 
29 misarticulations following auditory 
and visual stimulation were discovered 
in the total of 480 trials. 
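The total of 29 misarticulations in 480 trials follows from the tallies given above, and the arithmetic can be retraced in a few lines. The sketch below is only a bookkeeping check: the dictionaries record how many consonants fall in each "number of subjects correct" category for the two groups of 10 subjects, and all names are hypothetical.

```python
# Number of subjects (out of 10) producing a consonant correctly
# -> number of consonants listed in that category.
GROUP_JUDGED_BY_CLINICIANS = {10: 12, 9: 8, 8: 1, 7: 2, 5: 1}
GROUP_JUDGED_BY_WRITER = {10: 17, 9: 6, 8: 1}

GROUP_SIZE = 10
N_CONSONANTS = 24


def misarticulations(tally: dict, group_size: int = GROUP_SIZE) -> int:
    """Sum the subject-by-consonant cells judged incorrect."""
    return sum((group_size - n_correct) * n_cons
               for n_correct, n_cons in tally.items())


total_errors = (misarticulations(GROUP_JUDGED_BY_CLINICIANS)
                + misarticulations(GROUP_JUDGED_BY_WRITER))
total_trials = 2 * GROUP_SIZE * N_CONSONANTS
percent_correct = 100 * (total_trials - total_errors) / total_trials

if __name__ == "__main__":
    print(total_errors, total_trials, round(percent_correct))
```

Each group's categories account for all 24 consonants, which is a useful check on the consonant lists themselves.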


It is pertinent that this observation 
is consistent with the findings of Scott 
and Milisen (9) for subjects with func- 
tional misarticulations. Their results 
indicate that on the average their sub- 
jects correctly articulated ordinarily 
defectively articulated consonants 67% 
and 68% of the time in the initial and 
medial positions, respectively, of non- 
sense syllables following auditory and 
visual stimulation. The higher percent- 
age (94%) of correct articulation 
among the present subjects following 
auditory and visual stimulation can be 
partly explained by the fact that the 
present speech sample involved 24 
consonants, some of which the subjects 
ordinarily articulated correctly. More- 
over, four of the present subjects had 
essentially normal articulation; three, 
as noted earlier, were discharged as 
satisfactory following speech therapy, 
and a fourth was receiving speech 
therapy to reduce hypernasality, even 
though her articulation was normal. 
Finally, the fact that audible nasal
escape per se was not considered, by
operational definition, as a characteristic
of misarticulation undoubtedly en- 
hanced the present subjects’ chances of 
earning judgments of correct articula- 
tion in the test syllables. 

As previously explained, during the 
actual testing the subjects were en- 
couraged to produce the consonants- 
in-syllable as accurately as possible. 
Nasal pressure readings were recorded 
only when the examiner was satisfied 
that the subject was articulating as well 
as possible. Each stimulus monosyllable 
and trisyllable was presented orally to 
the subject by the examiner. Since the 
subjects were responding to an aural 
and visual stimulus, there is good reason 
to believe that they articulated as pro- 
ficiently in the final test as they did 
in the preliminary test. Moreover, the 
presence of wax nasal olives in their 
nostrils undoubtedly helped them to 
articulate the test syllables correctly. 
It is a common clinical observation that 
cleft palate persons improve in their ar- 
ticulation of pressure consonants while 
their nostrils are pinched or other- 
wise occluded. Although there may 
be impounding of air in the nasal cavity 
due to faulty velopharyngeal valving, 
occluding the nostrils usually results in 
more precise positioning of the articu- 
lators and a more acoustically accurate 
oral production of the sound. To a 
certain extent, the experimental ap- 
paratus effected a similar result, since 
any nasal air leak would be impounded 
in a closed air system. 


Results 


Consonantal Nasal Pressure. The consonants
were classified according to
manner of articulation as follows:
plosives, [p], [b], [t], [d], [k], and
[g]; fricatives, [f], [v], [θ], [ð], [s],
[z], [ʃ], [ʒ], and [h]; affricates, [tʃ]
and [dʒ]; nasals, [m] and [n]; and
glides, [ʍ], [w], [j], [l], and [r]. To
determine the effect of consonants on
nasal pressure, the data were evaluated
by analysis of variance in a treatments-
by-subjects design as described by
Lindquist (4, pp. 156-171). Critical
differences were employed to assess
interconsonant mean differences. Means
for consonant types by phonetic
classification were tabulated. The effect of
voicing for 18 consonants (nine cognate
pairs) was evaluated by analysis of
variance in a treatments-by-treatments-
by-subjects design and appropriate
critical differences (4, pp. 220-253).
This analysis is merely a more analytical
breakdown of the effect of consonants
on nasal pressure, minus the six
unpaired consonants. The analysis was
based on the original criterion measures
elicited from the subjects.


TABLE 1. Summary of analysis of variance to
evaluate the influence of consonants on nasal
pressure in CV monosyllables.

Source               df      ms       F*    F.01†
Consonants (C)       23    28.32    16.76    2.34
Subjects (S)         19    18.70
CS                  437     1.69     3.93    1.95
Within Cells (w)    960      .43

Total              1439

*F ratios: ms_C/ms_CS; ms_CS/ms_w.
†F.01 is the tabled value for the nearest given df.


TABLE 2. Summary of analysis of variance to
evaluate the influence of consonants on nasal
pressure in CVCVCV trisyllables.

Source               df      ms       F*    F.01†
Consonants (C)       23    21.91    17.67    2.34
Subjects (S)         19    14.11
CS                  437     1.24     3.65    1.95
Within Cells (w)    960      .34

Total              1439

*F ratios: ms_C/ms_CS; ms_CS/ms_w.
†F.01 is the tabled value for the nearest given df.


TABLE 3. Mean consonantal nasal pressure readings
for CV monosyllables and CVCVCV
trisyllables. Critical differences (c.d.) were
determined by the formula, c.d. = t(2ms_CS/n)^{1/2}.
Appropriate error mean squares are found in
Tables 1 and 2. (Critical differences at .01 level
are 1.06 and .90; at .05 level, .80 and .69.)

CV Monosyllables           CVCVCV Trisyllables
Cons    Mean Ht            Cons    Mean Ht
        (Cm Water)                 (Cm Water)
[f]      2.39              [f]      2.18
[s]      2.28              [s]      2.00
[tʃ]     1.97              [tʃ]     1.92
[θ]      1.82              [θ]      1.80
[ʃ]      1.81              [dʒ]     1.59
[v]      1.68              [v]      1.59
[dʒ]     1.67              [p]      1.56
[ʒ]      1.52              [ʃ]      1.52
[z]      1.52              [z]      1.43
[p]      1.50              [t]      1.35
[t]      1.24              [b]      1.27
[g]      1.23              [ʒ]      1.22
[k]      1.14              [g]      1.17
[b]      1.12              [d]      1.08
[ð]      1.08              [ð]      1.07
[d]      1.03              [k]       .97
[ʍ]       .71              [ʍ]       .90
[m]       .52              [m]       .84
[n]       .41              [n]       .82
[w]       .28              [w]       .42
[r]       .21              [r]       .24
[l]       .21              [h]       .17
[j]       .20              [j]       .13
[h]       .19              [l]       .13


TABLE 4. Mean nasal pressure readings (mean
height in centimeters of water) by consonant
types for each speech sample.

Consonant                  Speech Sample
Type                 CV               CVCVCV
                     Monosyllables    Trisyllables
Affricates (N=2)         1.82             1.75
Fricatives (N=9)         1.51             1.44
Plosives (N=6)           1.21             1.23
Nasals (N=2)              .46              .83
Glides (N=5)              .32              .36
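The F ratios in Tables 1 and 2 and the critical differences quoted with Table 3 can be retraced from the printed mean squares. The sketch below is an illustration only. It assumes the mean squares as read from the tables, n = 20 subjects, and large-sample t values (2.576 and 1.960) standing in for whatever tabled values the authors consulted; the function and variable names are hypothetical.

```python
import math

# Mean squares as read from Tables 1 and 2.
MEAN_SQUARES = {
    "CV": {"ms_c": 28.32, "ms_cs": 1.69, "ms_w": 0.43},
    "CVCVCV": {"ms_c": 21.91, "ms_cs": 1.24, "ms_w": 0.34},
}
N_SUBJECTS = 20

# Assumption: large-df two-tailed t values, used in place of the
# authors' tabled values for 437 df.
T_01 = 2.576
T_05 = 1.960


def f_ratios(ms: dict) -> tuple:
    """Return (F for consonants, F for the consonants-by-subjects term)."""
    return ms["ms_c"] / ms["ms_cs"], ms["ms_cs"] / ms["ms_w"]


def critical_difference(t: float, ms_error: float, n: int) -> float:
    """c.d. = t * sqrt(2 * ms_error / n), the formula given with Table 3."""
    return t * math.sqrt(2.0 * ms_error / n)


if __name__ == "__main__":
    for sample, ms in MEAN_SQUARES.items():
        f_c, f_cs = f_ratios(ms)
        cd_01 = critical_difference(T_01, ms["ms_cs"], N_SUBJECTS)
        cd_05 = critical_difference(T_05, ms["ms_cs"], N_SUBJECTS)
        print(sample, round(f_c, 2), round(f_cs, 2),
              round(cd_01, 2), round(cd_05, 2))
```

With these inputs the computed values agree with the printed F ratios to two decimals and with the printed critical differences to within about .01, the small residue being attributable to the choice of t value and to rounding.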

TABLE 5. Summary of analysis of variance to
evaluate the influence of voicing on nasal pressure
in CV monosyllables.

Source               df      ms       F*    F.01†
Voicing (V)           1    15.31    14.18    8.18
Cognate Pairs (C)     8     9.63    10.58    2.66
Subjects (S)         19     7.63
VC                    8      .32     1.14    2.66
VS                   19     1.08
CS                  152      .91
VCS                 152      .28

Total               359

*F ratios: ms_V/ms_VS; ms_C/ms_CS; ms_VC/ms_VCS.
†F.01 is the tabled value for the nearest given df.


TABLE 6. Summary of analysis of variance to
evaluate the influence of voicing on nasal
pressure in CVCVCV trisyllables.

Source               df      ms       F*    F.01†
Voicing (V)           1    12.67    29.47    8.18
Cognate Pairs (C)     8     5.80    14.87    2.66
Subjects (S)         19     5.53
VC                    8      .70     1.27    2.66
VS                   19      .43
CS                  152      .39
VCS                 152      .55

Total               359

*F ratios: ms_V/ms_VS; ms_C/ms_CS; ms_VC/ms_VCS.
†F.01 is the tabled value for the nearest given df.


In both types of speech samples,
monosyllables and trisyllables, consonants
were found to exert a significant
effect (.01 level) on nasal pressure
(Tables 1 and 2).

Table 3 summarizes the mean nasal
pressure readings for the consonants
in each type of speech sample and
indicates appropriate critical differences
used to assess the interconsonant mean
differences. In monosyllables, when
each syllable is compared with each of
the other syllables, there are 82
interconsonant mean differences significant
at the 1% level and 129 at the 5% level.
In trisyllables there are 90 mean
differences significant at the 1% level and
128 at the 5% level. For both
monosyllables and trisyllables the consonants
are quite similarly ranked in mean nasal
pressure. The rank-order correlation
is .97.
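The counts of significant interconsonant differences can be approximated from the rounded means in Table 3 by comparing all 276 pairs of consonants against the critical differences. The sketch below is an illustration under stated assumptions: means are entered in hundredths of a centimeter so that integer arithmetic avoids round-off at the boundary, a difference exactly equal to the critical difference is counted as significant, and the names are hypothetical. Because only rounded means are available, a count may differ by a pair or so from the figure the authors obtained from their full data.

```python
from itertools import combinations

# Table 3 means in hundredths of a centimeter of water, highest to lowest.
CV_MEANS = [239, 228, 197, 182, 181, 168, 167, 152, 152, 150, 124, 123,
            114, 112, 108, 103, 71, 52, 41, 28, 21, 21, 20, 19]
TRI_MEANS = [218, 200, 192, 180, 159, 159, 156, 152, 143, 135, 127, 122,
             117, 108, 107, 97, 90, 84, 82, 42, 24, 17, 13, 13]

# Critical differences quoted with Table 3, in the same units.
CD = {"CV": {"01": 106, "05": 80}, "CVCVCV": {"01": 90, "05": 69}}


def count_significant_pairs(means, critical_difference):
    """Count consonant pairs whose mean difference reaches the c.d."""
    return sum(1 for a, b in combinations(means, 2)
               if abs(a - b) >= critical_difference)


if __name__ == "__main__":
    for label, means in (("CV", CV_MEANS), ("CVCVCV", TRI_MEANS)):
        for level in ("01", "05"):
            n = count_significant_pairs(means, CD[label][level])
            print(label, level, n)
```

For monosyllables this procedure yields 82 pairs at the 1% level and 129 at the 5% level, and for trisyllables 90 pairs at the 1% level. Since any difference that reaches the 1% criterion also reaches the 5% criterion, the count at the 5% level is necessarily the larger of the two.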


TABLE 7. Mean differences in nasal pressure
readings for voiceless and voiced consonants
in CV monosyllables. Critical differences (c.d.)
were determined by the formula, c.d. =
t(2ms_VS/n)^{1/2}. The appropriate error mean
square is found in Table 5. (Critical differences
for single consonant mean differences
at .01 level, .94; at .05 level, .69; c.d. for
over-all mean difference at .01 level, .31; at .05
level, .23.)

Consonant           Mean Ht in Cm of Water
Pairs            Voiceless    Voiced    Mean
                 Cons         Cons      Diff
[p]-[b]            1.50        1.13      .37
[ʍ]-[w]             .71         .28      .43
[f]-[v]            2.39        1.68      .71*
[θ]-[ð]            1.82        1.08      .74*
[t]-[d]            1.24        1.02      .22
[s]-[z]            2.28        1.52      .76*
[ʃ]-[ʒ]            1.81        1.53      .28
[tʃ]-[dʒ]          1.97        1.67      .30
[k]-[g]            1.14        1.23     -.09
Overall Mean       1.65        1.24      .41†

*Significant at .05 level.
†Significant at .01 level.
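The over-all means in Table 7 and the critical differences in its caption can be retraced from the pair means and the voicing-by-subjects mean square of Table 5. The sketch below is a check under stated assumptions rather than the authors' computation: the two-tailed t values for 19 df (2.861 and 2.093) are standard tabled values, and n is taken as 20 for a single pair and as 180 (20 subjects by nine pairs) for the over-all comparison, which is an inference that happens to reproduce the printed figures. All names are hypothetical.

```python
import math

# (voiceless mean, voiced mean) in cm of water, as read from Table 7.
CV_PAIRS = {
    "p-b": (1.50, 1.13),
    "ʍ-w": (0.71, 0.28),
    "f-v": (2.39, 1.68),
    "θ-ð": (1.82, 1.08),
    "t-d": (1.24, 1.02),
    "s-z": (2.28, 1.52),
    "ʃ-ʒ": (1.81, 1.53),
    "tʃ-dʒ": (1.97, 1.67),
    "k-g": (1.14, 1.23),
}

MS_VS = 1.08                 # voicing-by-subjects mean square, Table 5
N_SUBJECTS = 20
T_01_DF19 = 2.861            # two-tailed t, 19 df, .01 level
T_05_DF19 = 2.093            # two-tailed t, 19 df, .05 level


def overall_means(pairs: dict) -> tuple:
    """Average the voiceless and voiced columns over all cognate pairs."""
    voiceless = sum(v for v, _ in pairs.values()) / len(pairs)
    voiced = sum(d for _, d in pairs.values()) / len(pairs)
    return voiceless, voiced


def critical_difference(t: float, ms_error: float, n: int) -> float:
    """c.d. = t * sqrt(2 * ms_error / n)."""
    return t * math.sqrt(2.0 * ms_error / n)


if __name__ == "__main__":
    voiceless, voiced = overall_means(CV_PAIRS)
    n_overall = N_SUBJECTS * len(CV_PAIRS)  # assumption: 20 x 9 = 180
    print(round(voiceless, 2), round(voiced, 2), round(voiceless - voiced, 2))
    print(round(critical_difference(T_01_DF19, MS_VS, N_SUBJECTS), 2))
    print(round(critical_difference(T_01_DF19, MS_VS, n_overall), 2))
```

The over-all difference of about .41 cm exceeds the .01-level critical difference of about .31 cm, which is the basis for the statement that voiceless consonants as a group involve greater nasal pressure.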


The six consonants that yielded the
highest mean nasal pressure readings
in monosyllables are [f], [s], [tʃ], [θ],
[ʃ], and [v], five fricatives and one
affricate. At the 1% level of significance
they exceed from seven to 14
other consonants in mean nasal pressure;
and at the 5% level, from eight
to 17 other consonants. In trisyllables
the six consonants with highest mean
nasal pressure readings are as follows:
[f], [s], [tʃ], [θ], [dʒ], and [v], four
fricatives and two affricates. These
consonants exceed five to 14 and eight
to 16 other consonants at the 1% and
5% levels of significance, respectively.
Similar inspection of the six consonants
that yielded the lowest mean nasal pressure
readings shows that in either type
of speech sample they are as follows:
[h], [j], [l], [r], [w], and [n], four
glides, one nasal, and one aspirate
fricative. In monosyllables they are
exceeded by 10 other consonants at the
1% level and from 12 to 16 other
consonants at the 5% level. In trisyllables
they are exceeded by from four
to 15 and from eight to 17 other consonants
at the two respective levels of
significance.


TABLE 8. Mean differences in nasal pressure
readings for voiceless and voiced consonants
in CVCVCV trisyllables. Critical differences
(c.d.) were determined by the formula, c.d. =
t(2ms_VS/n)^{1/2}. The appropriate error mean
square is found in Table 6. (Critical differences
for single consonant mean differences, at .01
level, .60; at .05 level, .44; c.d. for over-all
mean difference, at .01 level, .20; at .05 level,
.15.)

Consonant           Mean Ht in Cm of Water
Pairs            Voiceless    Voiced    Mean
                 Cons         Cons      Diff
[p]-[b]            1.56        1.27      .29
[ʍ]-[w]             .90         .42      .48*
[f]-[v]            2.18        1.59      .59*
[θ]-[ð]            1.80        1.07      .72†
[t]-[d]            1.35        1.08      .27
[s]-[z]            2.00        1.43      .57*
[ʃ]-[ʒ]            1.52        1.22      .30
[tʃ]-[dʒ]          1.92        1.59      .33
[k]-[g]             .97        1.17     -.20
Overall Mean       1.58        1.20      .38†

*Significant at .05 level.
†Significant at .01 level.

Thus it would appear that fricatives
and affricates yield the highest nasal
pressure readings and glides and nasals
the lowest. This impression is verified
in Table 4. By consonant type the consonants
in both kinds of speech samples
rank in mean nasal pressure from highest
to lowest as follows: affricates,
fricatives, plosives, nasals, and glides.

Significant Fs are observed for the
effect of voicing on mean nasal pressure
in both types of speech samples (Tables
5 and 6). Tables 7 and 8 show that the
over-all voiceless consonant means in
nasal pressure are significantly greater
(.01 level) than those for their voiced
counterparts. The voiceless consonant
in each of the following cognate pairs
has significantly greater mean nasal
pressure (.05 level or better) in both
monosyllables and trisyllables: [f]-[v],
[θ]-[ð], and [s]-[z]. Also in trisyllables
[ʍ] significantly exceeds [w] in
mean nasal pressure (.05 level).


TABLE 9. Rank-order correlations between
consonantal nasal pressure and average nasal
pressure of 20 cleft palate subjects.

               CV Monosyllables          CVCVCV Trisyllables
Consonants     Mean Ht in    Rho         Mean Ht in    Rho
               Cm of Water               Cm of Water
[f]              2.39    [illegible]*      2.18    [illegible]*
[s]              2.28        .87*          2.00        .79*
[ʃ]              1.81        .87*          1.52        .78*
[tʃ]             1.97        .81*          1.92        .82*
[ʒ]              1.52        .81*          1.22        .69*
[t]              1.24        .79*          1.35        .62*
[d]              1.03        .78*          1.08        .58*
[dʒ]             1.67        .78*          1.59        .79*
[z]              1.52        .72*          1.43        .68*
[k]              1.14        .71*           .97        .42
[ð]              1.08        .68*          1.07        .45†
[h]               .19        .67*           .17        .29
[v]              1.68        .65*          1.59        .48†
[n]               .41        .56*           .82        .44
[r]               .21        .53†           .24        .48†
[g]              1.23        .45†          1.17        .55†
[ʍ]               .71        .42            .90        .31
[l]               .21        .40            .13        .28
[θ]              1.82        .40           1.80        .85*
[p]              1.50        .35           1.56        .35
[b]              1.12        .35           1.27        .62*
[w]               .28        .32            .42        .26
[j]               .20        .30            .13        .32
[m]               .52        .11            .84        .25

*Significant at .01 level.
†Significant at .05 level.


Correlation of Consonantal and Aver- 
age Nasal Pressure. By ranking the 20 
subjects in nasal pressure for each 
consonant and in mean nasal pressure 
for all consonants, rank-order correla- 
tions were obtained between conso- 
nantal and average nasal pressure for 
both types of speech samples. The 
magnitude of these rhos indicates the 
extent to which each consonant acts as 
a diagnostic indicator of average nasal 
pressure for all subjects. The results of 
this procedure are summarized in Table 
9. 
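The rank-order procedure described above can be sketched as follows. This is an illustration, not the authors' computation: the subject data are invented, tied values receive the average of the ranks they span (the article does not say how ties were handled), and the coefficient is obtained as the product-moment correlation of the two sets of ranks, which equals the familiar 1 - 6Σd²/(n(n² - 1)) when there are no ties. All names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest); ties share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank-order correlation computed as Pearson's r on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical readings (cm of water) for five subjects: pressure on
    # one consonant, and each subject's mean over all consonants.
    one_consonant = [2.5, 0.5, 1.5, 3.0, 1.0]
    all_consonants = [1.8, 0.6, 1.4, 2.0, 0.4]
    print(round(spearman_rho(one_consonant, all_consonants), 2))
```

A consonant whose readings rank the subjects in nearly the same order as their over-all means yields a rho near 1 and is, in the sense used here, a good indicator of average nasal pressure.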


The consonants in monosyllables that 
correlate significantly (.05 level or 
better) with average nasal pressure are 
as follows: [f], [s], [ʃ], [tʃ], [ʒ], [t],
[d], [dʒ], [z], [k], [ð], [h], [v],
[n], [r], and [g]. In trisyllables they
are as follows: [θ], [tʃ], [s], [dʒ], [ʃ],
[f], [ʒ], [z], [t], [b], [d], [g], [v],
[r], and [ð]. In both types of speech
samples, with few exceptions, conso- 
nants with higher mean nasal pressure 
tend to correlate more highly with 
average nasal pressure than consonants 
with lower mean nasal pressure. 


Discussion 


The present findings for consonantal 
nasal pressure in cleft palate speakers 
parallel Black’s (1) results, which indi- 
cate that normal speakers effect greater 


intraoral pressure on the fricatives than 
on the plosives and on voiceless sounds 
than on voiced sounds. Thus it would 
appear that with a greater normal re- 
quirement of intraoral pressure for the 
articulation of certain consonants, cleft 
palate speakers exhibit greater nasal 
air escape in their efforts to articulate 
these consonants. 


The present finding that in cleft pal- 
ate speakers nasal pressure is greater 
for affricates, fricatives, and, to a lesser 
extent, plosives, while it is less on nasals 
and particularly on glides, corroborates 
the definition of nasal emission by Mc- 
Donald and Baker (6). It also tends 
to explain why nasal emission was per- 
ceived only during the articulation of 
fricatives and plosives in the Counihan 
(3) study (which made no distinction 
between fricative and affricate conso- 
nants). Undoubtedly the nasal pressure 
on the glides is so little that it is per- 
ceptually inconsequential. Nasal pres- 
sure on nasal sounds may either be too 
little to permit detection or the nasal 
air escape is masked by the natural 
nasal resonance characterizing these 
sounds. The present finding of signifi- 
cantly greater nasal pressure on the 
voiceless sounds than on the voiced 
sounds would also explain why Couni- 
han found nasal emission perceived 
more often in the voiceless sounds than 
in their voiced counterparts. It would 
appear therefore that the magnitude of 
consonantal nasal pressure has an im- 
portant influence on the perception of 
audible nasal emission. 


It seems equally apparent that the 
more common misarticulations of cleft 
palate speakers are largely the conse- 
quence of an inability to effect adequate 
velopharyngeal valving during the
production of those consonants that involve
greater intraoral pressure requirements. 
It is reasonable to expect that if valving 
is inefficient the amount of nasal air 
escape should be related to the degree 
to which cleft palate speakers experi- 
ence difficulty in articulating consonant 
sounds. By percentage of misarticula- 
tions, McWilliams’ (7) subjects expe- 
rienced difficulty in articulation of 
affricates, fricatives, plosives, nasals, 
and glides, in the stated order; and the 
subjects of Spriestersbach, Darley, and
Rouse (11) had their greatest percentage
of misarticulations on affricates,
followed by fricatives, plosives, glides,
and nasals. Subtelny and Subtelny (13)
reported that their cleft palate subjects 
had a high incidence of errors on sibi- 
lants, but experienced few errors on 
glides. They also reported a signifi- 
cantly greater number of fricative 
errors than plosive errors. Counihan (3) 
and Bzoch (2) reported findings that 
the most common misarticulations of 
their cleft palate subjects involve con- 
sonants that are normally articulated 
with greater oral pressure. The findings 
in the present study therefore indicate 
that the misarticulations most com- 
monly noted in other studies of cleft 
palate speakers (2, 3, 7, 11, 13) are
probably related to degree of nasal air
escape. 


There is a possibility, however, that 
degree of nasal pressure is not the only 
determinant of perceptible nasal emis- 
sion. One of the subjects in the present 
study, on articulating [sa] syllables, 
exhibited as much as four ounces per 
square inch of nasal pressure, yet his 
articulation of the [s] sound was judged 
to be acceptable and free of perceptible 
nasal emission by both authors and an 


entire class of graduate students. This 
same observation may also apply to 
over-all nasal emission perceived in the 
speech of cleft palate speakers. Three 
experienced clinicians were asked to 
judge the connected speech of 10 sub- 
jects in the present study for audible 
nasal emission. When their judgments 
were compared with total nasal pressure 
for these subjects, there was no rela- 
tionship between the data for each of 
two judges and there was a substantial 
negative relationship between them for 
the third judge. Other factors that 
might affect the perception of nasal 
emission include (a) the particular 
nasal environment of the speaker (de- 
gree of patency of the nares and nasal 
passages, and extent of velar and pha- 
ryngeal tissue vibration on valving 
attempts); (b) the timing and coordina- 
tion of the valving process (extent of 
nasal air escape on the implosive phase 
of a plosive sound which might be 
identified as audible nasal emission be- 
cause of its temporal separation from 
the oral expulsion of air); and (c) the 
masking effect of the articulated con- 
sonant on the sound of the nasal air 
escape. 


Results in the present study indicate 
that the 10 consonants-in-trisyllable 
yielding nasal pressure readings for 
individual subjects which correlated 
most highly with average nasal pressure
are, in the stated order: [θ], [tʃ],
[s], [dʒ], [ʃ], [f], [ʒ], [z], [t],
and [b] (Table 9). Since these 10 particular
consonants were among the 12
consonants articulated with greatest 
average nasal pressure by all subjects, 
it seems reasonable that the nasal pres- 
sure of cleft palate speakers during the 
articulation of these consonants is
related to functional velopharyngeal valving
efficiency for speech. Further, it
would appear that a test of the ability 
of cleft palate speakers to maintain 
adequate functional velopharyngeal 
closure during the articulation of a 
speech sample would logically include 
those consonants that have been found 
to be significant predictors of the rela- 
tive rank of any given individual’s 
average nasal pressure. This observation 
assumes, of course, that the test would 
involve nasal manometric readings. 


This does not negate the importance 
of wet spirometric and oral mano- 
metric measurements based upon con- 
ditions involving open and closed nares, 
as reported by Spriestersbach and 
Powers (12), for determining the ade- 
quacy of velopharyngeal closure. Nor 
does the present finding rule out the 
usefulness of lateral x rays in assessing 
the degree of closure effected during 
the sustained articulation of [u] (12,
13) and [s] (12). The present writers 
feel that wet spirometric and oral 
manometric measurements, as well as 
sagittal radiographic data, provide an 
indication of what might be termed 
velopharyngeal valving potential. There 
is no available evidence that either type 
of measurements indicates how well 
cleft palate speakers will effect velo- 
pharyngeal closure during connected 
speech. However, nasal manometric 
data based upon repeated syllabic artic- 
ulation indicate how well the individ- 
ual functionally effects velopharyngeal 
closure for speech. Therefore the 
sounds on which nasal pressure corre- 
lates most highly with average nasal 
pressure of cleft palate speakers should 
provide a useful basis for testing how 
well such subjects attain velopharyn- 
geal valving during speech. 


It is important to remember that nasal 
pressure need not be equated with nasal 
emission, particularly in the acoustic 
frame of reference within which 
nasal emission is often defined. ‘Nasal 
emission’ as an audible speech disturb- 
ance may or may not be evident in the 
speech of a person who exhibits a great 
amount of nasal pressure. It would 
appear that the diagnosis of perceptible 
nasal emission is a matter for clinical 
judgment, while nasal manometric 
readings should be useful in the assess- 
ment of velopharyngeal valving effi- 
ciency during the articulation of a 
syllabic speech sample. 


Summary 


Nasal manometric readings were ob- 
tained from 20 cleft palate speakers 
articulating CV monosyllables and 
CVCVCV trisyllables. The test in- 
cluded 24 consonants, each combined 
with a constant vowel, [a]. The fol- 
lowing conclusions appear defensible: 
consonants normally requiring greater 
intraoral pressure involve greater nasal 
pressure when articulated by cleft pal- 
ate speakers. Mean nasal pressure for 
consonant types ranks from highest to 
lowest as follows: affricates, fricatives, 
plosives, nasals, and glides. Voiceless 
sounds, considered as a group, involve 
greater nasal pressure than their voiced 
counterparts. Consonantal misarticula- 
tions in cleft palate speakers most 
frequently reported by different inves- 
tigators involve consonants with higher 
mean nasal pressure. Nasal pressure of 
individual subjects on 10 consonants- 
in-trisyllable tends to predict each 
subject’s rank in average nasal pressure. 
Nasal pressure readings for these consonants
may provide diagnostic information
about functional velopharyngeal
valving for connected speech. However,
diagnosis of audible nasal emission
remains a matter for clinical judgment.


References 


1. Black, J. W., The pressure component
in the production of consonants. J.
Speech Hearing Dis., 15, 1950, 207-210.

2. Bzoch, K. R., An investigation of the
speech of pre-school cleft palate children.
Ph.D. dissertation, Northwestern
Univ., 1956.

3. Counihan, D. T., A clinical study of the
speech efficiency and structural adequacy
of operated adolescent and adult cleft
palate persons. Ph.D. dissertation,
Northwestern Univ., 1956.

4. Lindquist, E. F., Design and Analysis of
Experiments in Psychology and Education.
Boston: Houghton Mifflin, 1953.

5. Masland, Mary W., Testing and correcting
cleft palate speech. J. Speech Dis.,
11, 1946, 309-320.

6. McDonald, E. T., and Baker, H. K.,
Cleft palate speech: an integration of
research and clinical observation. J.
Speech Hearing Dis., 16, 1951, 9-20.

7. McWilliams, Betty Jane, Articulation
problems of a group of cleft palate
adults. J. Speech Hearing Res., 1, 1958,
68-74.

8. Morley, Muriel E., Cleft Palate and
Speech. Baltimore: Williams and Wilkins,
1954.

9. Scott, D. A., and Milisen, R., The effectiveness
of combined visual-auditory
stimulation in improving articulation. J.
Speech Hearing Dis., monogr. suppl. 4,
1954, 51-56.

10. Seth, G., and Guthrie, D., Speech in
Childhood, Its Development and Disorders.
London: Oxford Univ. Press,
1935.

11. Spriestersbach, D. C., Darley, F. L., and
Rouse, Verna, Articulation of a group
of children with cleft lips and cleft
palates. J. Speech Hearing Dis., 21, 1956,
436-445.

12. Spriestersbach, D. C., and Powers, G. R.,
Articulation skills, velopharyngeal closure,
and oral breath pressure of children with
cleft palates. J. Speech Hearing Res., 2,
1959, 318-325.

13. Subtelny, Joanne D., and Subtelny,
J. D., Intelligibility and associated physiological
factors of cleft palate speakers.
J. Speech Hearing Res., 2, 1959, 353-360.

14. Van Riper, C. R., Speech Correction,
Principles and Methods. New York:
Prentice-Hall, 1954.

15. West, R. W., Kennedy, Lou, and Carr,
Anna, The Rehabilitation of Speech. New
York: Harper, 1947.


RESEARCH NEWS NOTE


Client-Centered Counseling
with Adult Stutterers


A long-range investigation is being carried
on under the title of ‘Results of Client-Centered
Counseling with Adult Stutterers.’
The design incorporates client-centered counseling
as the therapeutic device, with therapy
being administered in one-hour sessions once
or twice per week. Tape recordings of three
one-minute segments are randomly selected
from therapy interviews at one-month intervals,
and, at the conclusion of therapy, the
excerpts are randomized and rated for severity.
Rating is by a panel of judges, with severity
being assessed on a seven-point scale. A
frequency count of blocks is also included
to assess changes in severity throughout the
period of therapy.

The eventual aim is to include at least 20
subjects in the project. Subsequent departures
in design might incorporate comparisons of
one and two therapy hours per week; of
individual and group therapy; of individual
therapy and a combination of both individual
and group therapy; of client-centered
counseling as the only therapy device and
counseling plus direct techniques.

An attempt was made at the outset to include
analysis of client statements to determine
degree of change in attitude throughout the
period of therapy, but this was abandoned
because of the difficulty in classifying client
statements. An improved scaling technique
to measure such change objectively is presently
being developed.


Robert F. Hejna, Ph.D.
Director, Speech and Hearing Clinic
University of Connecticut, Storrs








Phoneme Perception in Lipreading 


MARY F. WOODWARD 


CARROLL G. BARBER 


The investigation reported here repre- 
sents the pilot experiment of a research 
program designed to apply the theory 
and method of modern structural lin- 
guistics to problems concerning the 
visual perception of speech by aurally- 
handicapped individuals. Specifically, 
the program has the following long- 
range objectives: (a) to develop a the- 
oretical model of the structure of 
perception in lipreading, that is, a defi- 
nition of the units of visual perception 
of oral-aural stimuli and of the relation- 
ships among these units in a system of 
oral-visual communication, and (b) to 
establish the relationship of the visually- 
perceived symbols to the underlying 
linguistic system. 

Mary F. Woodward (Ph.D., University of
California, Los Angeles, 1958) and Carroll
G. Barber (M.A., University of Arizona,
1952) are Research Associates, John Tracy
Clinic and University of Southern California.
This investigation was supported in part by
Research Grant 17 from the Office of Vocational
Rehabilitation, U. S. Department of
Health, Education, and Welfare, and in part
by the John Tracy Clinic. The general research
program outlined here is being continued
with the support of funds from the
Office of Education, U. S. Department of
Health, Education, and Welfare, under the
auspices of the John Tracy Clinic and the
School of Education, University of Southern
California.

The general research design for this
analysis of lipreading, constructed on
the basis of linguistic theory and method,
calls for a series of experiments to
test the visual perception of the de- 
scriptive units of structural linguistics, 
beginning on the phonological level and 
later proceeding to morphology and 
syntax. The initial experiment was 
designed to discover ‘units of lipread- 
ing’ comparable to the phonemes re- 
sulting from ordinary linguistic analysis. 
As the ultimate functional constituents 
of speech, phonemes are the lowest- 
order units of communicative value in 
a linguistic system. More explicitly, 
phonemes are the ultimate phonological 
constituents of an utterance; though 
they may be analyzed further into allo- 
phones, the differences among these are 
determined and nonfunctional. In these 
terms, the phoneme has also been con- 
sidered the minimal dynamic unit of 
language, as in Sapir’s (4, pp. 46-60) 
discussion of the ‘psychological reality 
of phonemes.’ 


Given the fundamental nature of the 
phoneme in any linguistic system, it 
may be anticipated that one component 
of lipreading comprehension should be 
statable in terms of the relative visibility 
of difference among phonemes. Pho- 
nemes may be classified in terms of the 
articulatory differences characteristic of 
a given phonological system. Articulatory
differences, and the visual perceptibility
of these, may constitute units of
lipreading at the simplest level of analysis
and perhaps at a basic level of
visual speech perception as well.


Experimental Design 


The pilot experiment was set up to 
test the visual discriminability of the 
phonemic contrasts among word-initial 
allophones of English consonants, as 
indicated by the perception of differ- 
ences between pairs of minimally-dis- 
tinctive nonsense syllables. Because the 
immediate goal was the discovery of 
minimum units and dimensions of dif- 
ference in lipreading, it was necessary 
to establish a context in which no re- 
dundant features would be possible. 
The stimulus materials, therefore, had 
to exclude the potential redundancy of 
situational cues present in normal con- 
versation, of allophonic variation, and 
of lexical or grammatical context. To 
satisfy these experimental aims, stimuli 
had to be presented as minimal pairs in 
a constant phonemic environment and 
in a nonredundant context. This par- 
ticular set of conditions can be achieved 
for English only with stimulus materials 
which consist of monosyllabic nonsense 
words. 


It is apparent that in larger utter- 
ances it would be difficult to impose 
adequate controls for contextual vari- 
ation. The lipreader might discriminate, 
for example, between the words ‘pill’ 
and ‘bill’ when these forms occur in 
such utterances as ‘he swallowed the 
pill’ and ‘he paid the bill,’ but it is not 
justifiable to infer from this that the 
lipreader has actually distinguished be- 
tween /p/ and /b/; he may merely 
have distinguished the contexts in which 
the two words occur. That the lipreader
can sometimes discriminate between
forms distinguished by the voiceless-voiced
consonants of English, then,
does not necessarily mean that he can 
see the articulatory differences between 
them. That these consonants are visual- 
ly discriminable is an assertion some- 
times made by teachers of the deaf, 
however, and it has appeared as well in 
published research (6). A means of 
proving or disproving such assertions 
lies in the use of test materials which 
consist of minimally different stimuli. 

In view of these restrictions, stimuli
for the pilot experiment were constructed
according to the formula
C₁V-C₂V (or C₁V-C₁V), for example,
pa-ka (or pa-pa). The choice
of the vowel was restricted to [a] and
[ɔ], the only vocalic nuclei of English
normally occurring in final open syllables
with primary stress without a
diphthongization which might be homorganic
with another vowel phoneme
(9, pp. 7-12). The low central [a] was
selected because comparable [ɔ] syllables
yield several meaningful combinations.
The use of closed CVC syllables
would have resulted in even less
control for meaning.

The subject’s task in the experimental 
situation was to respond to the members 
of each syllable pair as ‘alike’ or ‘differ- 
ent.’ Analysis of these responses was 
designed to establish (a) a rank order 
of visual perceptibility of consonant 
phonations, and from this, (b) a hier- 
archy of visual contrastiveness among 
the phonetic differences which are as- 
sumed to be crucial in the aural per- 
ception of speech. 


*Specifically, the vocalic nucleus selected
is phonetically a long [a·], phonemicized as
a simple vowel plus mid-central glide: /ah/
(1, pp. 33-39, 8). For the sake of simplicity,
it is written here as -a.










Experimental Sample. According to
the requirement that every phoneme
be in some way contrasted with every
other in experiments of this kind (2,
pp. 19-20, 7), materials were selected
so that every classificatory dimension 
of the English consonantal system was 
represented in the sample. In other 
words, every criterion was to be tested 
against every other, though not all 
specific combinations resulting from the 
cross-cutting criteria were to be com- 
pared with all others. 


The following general classificatory 
criteria were used as a guide for pos- 
tulating a structure of visual relation- 
ships among consonantal articulations, 
and for selecting the specific pairs to 
be tested: (a) type of articulation (stop, 
affricate, spirant, resonant), (b) area 
of articulation (bilabial, labiodental, 
dental, alveolar, alveopalatal, velar, 
glottal), (c) resonance type (oral, na- 
sal), and (d) coarticulations (voicing, 
affrication, palatalization, labialization). 
With the exception of voice, the vari- 
ations included as coarticulatory phe- 
nomena could have been accounted for 
simply as combinations of difference in 
type and area of articulation. The addi- 
tional dimensions were included, how- 
ever, to make possible the examination 
of certain relationships which are not 
explicit in the basic classification. Still 
other distinctions characteristic of Eng- 
lish phonology (for example, aspiration 
and force of articulation) were omitted 
as explicit variables since they may be 
subsumed under the voiceless-voiced 
contrast in the consonantal environment 
used (initial position of a stressed syllable).


In terms of the categories derivable 
from these criteria, every English consonant
can be differentiated from any
other; that is, any pair of consonants 
may be distinguished according to one 
or more of the phonetic dimensions sig- 
nificant in English phonology. By test- 
ing visual responses to the phonetic 
differences between selected consonant 
pairs, therefore, it should be possible 
to isolate critical dimensions of visual 
difference among these phonemes. The 
‘critical dimensions’ are those in terms 
of which articulations may be received 
visually and interpreted phonemically. 
More explicit rationalization for this 
procedure may be found in the theo- 
retical position of Prague School lin- 
guists who tend to equate dimensions 
of phonetic contrast with the phonemes 
of a language (3, 5, pp. 20-21).

Stimulus Materials. In addition to the
22 word-initial consonants listed in
most phonemic analyses of English,*
three other syllable-initial positions
were tested: /ž/, /ø/, and /W/. Although
it does not occur initially in
English, /ž/ was included in the sample
to complete the pattern of voiceless-voiced
contrast, for example, /š : ž = t : d/,
and to permit a further check
on the effect of palatalization on contrastive
visibility, for example, /z : ž =
s : š/. Zero consonant initials, /ø/,
were included not only as a possible
index of absolute visibility, but also to
provide hypothetical base forms for
measuring coarticulatory phenomena,
for example, palatalization /ø : y = s :
š/. For classificatory convenience, syllables
beginning with zero, written øa,
were regarded as voiced counterparts of
h-syllables, ha. The consonant cluster
/hw/ was treated as a unit voiceless
phoneme, /W/, to allow such comparisons
as /h : W = ø : w/.

*The classification of English phonemes
used in this study follows the analysis of
Trager and Smith (8). Their phonetic symbols
have been retained since they are also
employed in other recent analyses of English
phonology, for example, Gleason (1). Those
which may be unfamiliar to some readers are
listed below, followed by the IPA equivalents:
č = tʃ, ǰ = dʒ, š = ʃ, ž = ʒ,
y = j, W = hw. All other symbols have the
same value as they do in the IPA.


By varying these 25 initial conso- 
nants, as well as the order of syllables 
within each pair, 625 different syllable 
pairs can be derived, 25 of which con- 
tain phonemically identical members. 
Stimuli for the experiment consisted of 
229 of these pairs. To control for se- 
quence of presentation, each phonemi- 
cally-different pair of consonants 
selected was included in both possible 
orders, for example: pa-ka and ka-pa.
Test materials included all identical
pairs (25), all combinations with zero
(49, including øa-øa), all voiceless-voiced
comparisons (20, including ha-øa),
and a selected sample of syllable
pairs differing by one or more of the 
other classificatory criteria. The stim- 
ulus units presented in the test may be 
found in Table 1; the 102 pairs listed, 
doubled to provide for reversing the 
order of presentation, together with the 
25 identical pairs which are not in- 
cluded in the table, make up the total 
sample of 229. 
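
The stimulus counts given above can be checked with a little arithmetic. The sketch below is illustrative only: the ASCII spellings of the 25 initials and the list of ten voiceless-voiced contrasts are assumptions made for the example, not notation taken from the study.

```python
from itertools import product

# 22 word-initial consonants plus the three added initials (zh, zero, W).
# The ASCII spellings are placeholders chosen for this sketch.
CONSONANTS = [
    "p", "b", "t", "d", "k", "g", "ch", "j", "f", "v", "th", "dh",
    "s", "z", "sh", "zh", "h", "m", "n", "l", "r", "w", "y", "W", "zero",
]

# Every ordered syllable pair, including the pairs with identical members.
ordered_pairs = list(product(CONSONANTS, repeat=2))
identical = [(a, b) for a, b in ordered_pairs if a == b]

# Ordered pairs in which at least one member is the zero initial.
with_zero = [(a, b) for a, b in ordered_pairs if "zero" in (a, b)]

# Assumed set of ten voiceless-voiced contrasts, each shown in both orders.
VOICING_PAIRS = [
    ("p", "b"), ("t", "d"), ("k", "g"), ("ch", "j"), ("f", "v"),
    ("th", "dh"), ("s", "z"), ("sh", "zh"), ("W", "w"), ("h", "zero"),
]
voicing_ordered = 2 * len(VOICING_PAIRS)

# 102 different consonant pairs, doubled for order, plus the identical pairs.
UNORDERED_TESTED = 102
total_stimuli = 2 * UNORDERED_TESTED + len(identical)

print(len(ordered_pairs), len(identical), len(with_zero),
      voicing_ordered, total_stimuli)
```

The totals agree with the figures in the text: 625 possible pairs, 25 identical pairs, 49 zero combinations, 20 voiceless-voiced comparisons, and 229 stimuli.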


Hypotheses. On the basis of the 
phonetic dimensions outlined above, a 
general hypothesis of differential visibil- 
ity of phonation was set up: hypothesis 
1, absolute visibility of phonation is a 
function of area of articulation. Specifically,
it was postulated that a rank
order of visibility could be set up as
follows: (a) labials and labialized consonants
/p b m f v W w r/, (b) alveopalatals
/č ǰ š ž y/, (c) alveolars and
dentals /t d n l s z θ ð/, and (d) velars
and glottals /k g h ø/. The ranking
was not regarded as an equally descend- 
ing scale, but as a dichotomous one in 
which members of groups (a) and (b) 
would be highly visible, and those of 
(c) and (d) barely visible. This partic- 
ular division was made because in- 
formal pretesting showed a discernible 
amount of labial involvement in alveo- 
palatal articulations, and little or none 
for the alveolar, dental, velar, and glot- 
tal articulations. 


From the implications of the first 
hypothesis, a further hypothesis of dif- 
ferential visibility of contrast was de- 
rived: hypothesis 2, visibility of contrast 
is a function of relative visibility of 
phonation. Specific predictions were 
made by a classification of all the 625 
possible consonant pairs into three 
hypothetical categories of variation, representing
predicted response frequencies:
visually contrastive, visually
similar, and visually equivalent. ‘Con- 
trastive’ was a prediction that most sub- 
jects would call such pairs ‘different’; 
‘equivalent’ was a prediction that most 
subjects would call such pairs ‘alike.’ 
Pairs classified as ‘similar,’ on the other 
hand, express relations for which no a 
priori criterion justified a prediction of 
either contrastiveness or equivalence. 
Consequently, these pairs were assigned 
arbitrarily to an ambiguous range in 
which it was expected that comparisons 
would evoke both ‘alike’ and ‘different’ 
responses. Functionally, of course, such 
a classification is dichotomous, since 
‘similar’ pairs may be grouped with 
‘equivalent’ ones, as opposed to those
pairs demonstrating articulatory differences
which have contrastive value in
lipreading.


Table 1. Rank order of visual difference for pairs of English initial consonants.
Pairs listed on the same line are tied at that rank.

Rank   Consonant Pair          Visual Difference Rating

Contrastive
  1    W-s                      1.98
  2    w-y                      1.96
  3    p-š, v-ø, W-ø            1.91
  4    p-d, f-š, w-l            1.90
  5    b-d, r-n, ǰ-ø            1.87
  6    m-n                      1.86
  7    r-g, z-ø                 1.84
  8    p-t, w-h                 1.82
  9    r-l, t-ø                 1.81
 10    p-k, s-ø                 1.80
 11    p-W, b-w, f-W, š-ø       1.78
 12    r-s                      1.76
 13    v-z, θ-ø                 1.75
 14    m-ø, f-k, v-w, W-h       1.74
 15    w-ø                      1.72
 16    r-t                      1.71
 17    r-z                      1.68
 18    b-ø, f-ø, r-ø            1.67
 19    r-[?]                    1.64
 20    č-ø, [?]                 1.61
 21    p-f                      1.55
 22    r-d                      1.54
 23    [?]-v                    1.48
 24    [?]-ø                    1.44

Similar
 25    č-y                      1.33
 26    n-ø                      1.32
 27    l-ø                      1.20
 28    z-y                      1.14
 29    y-ø                      1.05
 30    š-s                      1.04
 31    f-r                      1.03
 32    č-k, ž-z                 1.02
 33    v-ð                       .97
 34    s-k                       .94
 35    d-ø                       .84
 36    h-ø                       .82
 37    d-[?]                     .79
 38    g-ø                       .78
 39    t-θ                       .7[?]
 40    d-[?]                     .69
 41    p-ø                       .64
 42    ǰ-y                       .61
 43    š-z, d-h                  .55
 44    [?]-y                     .48
 45    k-h                       .44
 46    [?], d-l                  .4[?]
 47    g-h                       .38
 48    k-ø                       .34
 49    w-r                       .32
 50    č-t                       .23
 51    t-k                       .17
 52    θ-s                       .16
 53    y-l                       .12
 54    d-n                       .09
 55    y-s                       .06

Equivalent
 56    ð-z                      -.02
 57    d-g                      -.09
 58    n-l                      -.11
 59    t-n                      -.23
 60    p-b                      -.35
 61    b-m                      -.41
 62    ǰ-d, k-g                 -.56
 63    d-s                      -.57
 64    t-s                      -.60
 65    p-m                      -.66
 66    š-t                      -.76
 67    ž-d                      -.80
 68    d-z                      -.81
 69    f-v                      -.83
 70    y-g                      -.84
 71    s-z                     -1.04
 72    θ-ð                     -1.09
 73    t-d                     -1.11
 74    ǰ-z                     -1.22
 75    č-ǰ                     -1.40
 76    W-w                     -1.43
 77    š-ž                     -1.51
 78    [?]                     -1.56

Briefly, all pairs of homorganic con- 
sonants and all pairs of alveolar, dental, 
velar, and glottal consonants which are 
neither palatalized nor labialized, were 
expected to fall into the visually equiv- 
alent or visually similar categories. Ac- 
cordingly, consonants of groups (a) 
and (b) were expected to be character- 
ized by greater differentiation than 
those of groups (c) and (d), that is, 
they were expected to participate in 
more pairs labeled visually contrastive. 
Since the hypothesis predicted the de- 
gree of visual difference between mem- 
bers of all possible pairs of consonant 
initials, it was possible to select the 
particular pairs to be tested to be repre- 
sentative of the phonetic dimensions 
discussed earlier. In comparison with 
the dimensions which define differences 
in auditory reception of speech, these 
postulated visual relationships among 
English consonants indicate a marked 
leveling of distinctions. Dimensions of 
articulatory type, resonance, voice, 
affrication, and several areas of articula- 
tion which are functionally different 
in English phonology would not, in 
terms of these hypotheses, have pho- 
nemic value in the visual perception of 
English speech. 


Experimental Procedures 


Because uniformity in the presenta- 
tion of stimuli was considered desirable, 
test materials were filmed for presenta- 
tion to the subjects. The syllable pairs 
selected were alphabetized according 
to the classificatory order established 
in the hypotheses and were then randomized
by means of a table of random
numbers. A film script was prepared,
utilizing the random order list of syl- 
lable pairs numbered from 1 to 229. 
The test was recorded on black and 
white sound film, showing a face-on 
view of the speaker, including the head 
and shoulders.* Although there are rea- 
sons for suspecting that the direct full- 
face view is not always the most criti- 
cal one for perceptibility in lipreading, 
this is the view traditionally recom- 
mended for lipreaders, and it was selec- 
ted for that reason. 


There were five experimental and 
two control groups, a total of 305 sub- 
jects. The test was administered in the 
following ways: to the five experi- 
mental groups, 185 subjects, the film 
without sound; to one control group, 
65 subjects, the sound track alone but 
with item numbers shown on the 
screen; to the other control group, 55 
subjects, the complete film with sound. 
All subjects were supplied with num- 
bered answer sheets on which they 
could indicate whether they thought 
the members of each syllable pair were 
‘alike’ or ‘different.’ The three test 
conditions, as well as the groups taking 
the tests, are henceforth referred to as 
visual, auditory, and audiovisual. 

With the exception of the first two
visual groups, subjects were university
undergraduates, and the film was
shown in university classrooms. The
two groups mentioned, totaling 38 subjects,
were interested in the education
of the deaf and included individuals
with some experience in lipreading.
The group scores, however, showed no
significant differences from those of the
university undergraduates.

*The speaker for the film test was Judith
Joel, a graduate student in Anthropology and
Linguistics at the University of California,
Los Angeles. Miss Joel normally speaks a
variety of Standard Midwestern American
English; her linguistic training made it possible
for her to pronounce repetitions of the
same syllable with a high degree of phonetic
and visual uniformity. The film was produced
by the Cinema Department, University of
Southern California.

Subjects were selected on the basis of 
two main criteria. Inasmuch as the in- 
vestigation was designed to discover 
linguistically-determined units of visual 
speech perception, normal-hearing 
speakers of English were necessary. 
Individuals with hearing impairments, 
though perhaps better lipreaders, were 
not considered appropriate subjects 
since they would to some degree lack 
the sensory experiences of hearing- 
speaking exchange. The other criterion 
was that of willingness to take the test, 
which was satisfied by using volunteer 
subjects. Since the test was both long 
and demanding of constant attention, 
it was hoped that such subjects would 
be more likely to respond consistently 
for the duration of the test. 


Analysis 


Techniques for application of the 
theoretical model were developed in a 
preliminary analysis (10) of data col- 
lected from the first two visual groups 
of 38 subjects and were then applied 
to responses obtained from the total 
experimental sample of 185 subjects. It 
is perhaps a reflection of the systematic 
nature of linguistic behavior and, con- 
sequently, of the aptness of a linguistic 
model for lipreading research, that no 
major changes in results accompanied 
the almost five-fold increase in the num- 
ber of subjects tested. 

As an initial check, an analysis of the 
average number of correct responses
for each interval of 10 stimulus units
revealed no significant differences in 
responses to the syllable pairs presented 
during the latter part of the test as 
compared with earlier responses. It was 
assumed, therefore, that the results were 
not biased by learning during the 
course of the test. 


To rank the consonant pairs tested in 
a descending scale of visual discrimi- 
nation, each of the 229 syllable pairs 
was rated in terms of the difference be- 
tween the number of subjects who 
called the two members ‘alike’ and the 
number who called them ‘different.’ 
Thus, a discrimination value for each 
pair was derived by subtracting the per- 
centage of ‘alike’ responses from the 
percentage of ‘different’ responses. Dis- 
crimination values for reciprocal pairs 
were then added together to yield a
visual difference rating for that particu- 
lar consonant combination. For ex- 
ample, eight subjects called the syllable 
pair pa—ka ‘alike’ and 177 called it ‘dif- 
ferent,’ 4% and 96%, respectively, giv- 
ing a discrimination value of .92. For 
the pair ka—pa, 11 subjects called it 
‘alike’ and 174 ‘different,’ 6% and 94% 
respectively, giving .88. The visual dif- 
ference rating for the consonant pair 
/p—k/, then, is 1.80 (Rank 10, Table 
1). Visual difference ratings for all 
combinations tested were then placed 
in a rank order of visual difference. 
Pairs having two identical syllables 
were omitted from this analysis since 
they had no reciprocal stimuli and were 
presented only once in the test. 
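
The rating procedure just described can be sketched in a few lines. The function names are hypothetical, and the step of rounding each percentage to a whole number before subtracting is an assumption inferred from the worked /p-k/ example rather than a stated rule.

```python
def discrimination_value(alike: int, different: int) -> float:
    """Percentage of 'different' responses minus percentage of 'alike'
    responses for one ordered syllable pair, expressed as a proportion."""
    total = alike + different
    # Assumption: percentages are rounded to whole numbers, as in the
    # worked example in the text (4% and 96%, 6% and 94%).
    pct_alike = round(100 * alike / total)
    pct_different = round(100 * different / total)
    return (pct_different - pct_alike) / 100


def visual_difference_rating(first_order, second_order) -> float:
    """Sum of the discrimination values for the two orders of presentation.
    Each argument is an (alike, different) tuple of response counts."""
    total = (discrimination_value(*first_order)
             + discrimination_value(*second_order))
    return round(total, 2)


# Response counts reported in the text for pa-ka and ka-pa (185 subjects).
pa_ka = (8, 177)
ka_pa = (11, 174)
print(discrimination_value(*pa_ka))
print(discrimination_value(*ka_pa))
print(visual_difference_rating(pa_ka, ka_pa))
```

Because each discrimination value lies between -1.00 and +1.00, the sum for a reciprocal pair is bounded by -2.00 and +2.00.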


Although an arrangement of visual 
difference ratings in a descending scale 
would extend theoretically from +2.00
to -2.00, actual results show a range
from +1.98 to -1.56. This range was
divided into three major classes which
correspond roughly to the differential 
categories established in the hypothesis: 
(a) contrastive, ratings from +1.98 to
+1.44; (b) similar, from +1.33 to
+.06; and (c) equivalent, from -.02 to
-1.56. This division was made primarily
on the basis of phonetically comparable
relations, since in most cases
it was possible to include within one
group such pairs as /b-d/, +1.87;
/p-t/, +1.82; and /m-n/, +1.86; these
pairs include bilabial and alveolar consonants
of comparable articulatory
type. This analysis may be justified 
statistically by virtue of a division of 
the data into an approximately equal 
number of ranks among the three cate- 
gories and because of the relative values 
of the mathematical intervals separating 
them. The specific consonant pairs in- 
cluded in each of the classes are given 
in Table 1, which shows 44 pairs la- 
belled contrastive (ranks 1-24), 34 
similar (ranks 25-55), and 24 equiva- 
lent (ranks 56-78). 
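
As an illustration of the three-way division, the sketch below assigns a rating to a class. The cut-off values are an assumption: they are simply the lowest ratings observed in the contrastive and similar classes, and the text does not say how a rating falling in the gaps between classes would have been treated.

```python
# Assumed cut-offs: the lowest observed rating in each of the upper classes.
CONTRASTIVE_MIN = 1.44
SIMILAR_MIN = 0.06


def classify(rating: float) -> str:
    """Assign a visual difference rating to one of the three classes."""
    if rating >= CONTRASTIVE_MIN:
        return "contrastive"
    if rating >= SIMILAR_MIN:
        return "similar"
    return "equivalent"


# Ratings quoted in the text.
for pair, rating in [("b-d", 1.87), ("p-t", 1.82), ("m-n", 1.86),
                     ("p-k", 1.80)]:
    print(pair, rating, classify(rating))
```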


Results 


On the basis of this analysis of the 
data, the hypotheses of absolute and 
relative visibility of English initial con- 
sonants were assessed. The degree of 
articulatory differentiation, decreased 
by the hypotheses in moving from au- 
ditory to visual conditions of reception, 
was reduced to an even lower level by 
the results of the study. Of those phonetic
dimensions which define the
significant articulatory differences in
English speech, almost all (including
articulation type, resonance type, voice,
affrication, palatalization, and all areas
of articulation except the labial) are
virtually neutralized as factors of difference
in visual perception.

In terms of these data, the following 
four sets of English consonant initials 
can be classified as visually contrastive: 

Unit 1: p b m

Unit 2: W w r

Unit 3: f v

Unit 4: t d n l θ ð s z č ǰ š ž y k g h

The four units may be categorized
briefly as bilabial, rounded labial, labio- 
dental, and nonlabial, respectively. 
While these units contrast visually with 
each other, they are internally homoph- 
enous, that is, the members of each 
unit look alike to the lipreader. While 
reference to Table 1 shows that /f/ 
was confused with /r/ (Rank 31) and 
/v/ with /ð/ (Rank 33), Unit 3 is
regarded as distinctive because /f/ and 
/v/ contrast with all other members of 
the other three units. 
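
The four-unit grouping can be expressed as a simple lookup. This is a sketch, not the authors' procedure: the ASCII spellings are placeholders, the membership of the fourth unit follows the listing above as best it can be read, and the all-or-none prediction ignores the partial confusions noted for /f/ and /v/.

```python
# Visual units of English consonant initials; spellings are placeholders.
VISUAL_UNITS = {
    "bilabial": ["p", "b", "m"],
    "rounded labial": ["W", "w", "r"],
    "labiodental": ["f", "v"],
    "nonlabial": ["t", "d", "n", "l", "th", "dh", "s", "z",
                  "ch", "j", "sh", "zh", "y", "k", "g", "h"],
}

# Reverse index: consonant -> name of the unit that contains it.
UNIT_OF = {c: unit for unit, members in VISUAL_UNITS.items() for c in members}


def predicted_relation(a: str, b: str) -> str:
    """Consonants in the same unit are predicted to look alike;
    consonants in different units are predicted to contrast."""
    return "homophenous" if UNIT_OF[a] == UNIT_OF[b] else "contrastive"


print(predicted_relation("p", "b"))
print(predicted_relation("p", "k"))
print(predicted_relation("t", "d"))
```

On this model /p-b/ and /t-d/ fall within a single unit, while /p-k/ crosses units, in line with their ratings in Table 1.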


Even though Unit 4 was set up on the 
basis of visually-similar as well as visu- 
ally-equivalent relationships, it is 
apparent that lipreaders cannot dis- 
criminate consistently among the indi- 
vidual phonemes included. If lipreaders 
are to distinguish these various types of 
phonemes—alveolar, dental, alveopalatal, 
velar, and glottal—it must be on the 
basis of phonetic, lexical, or gram- 
matical redundancy, since the articula- 
tory differences among them are not 
readily available to visual observation. 


Although responses to pairs containing
a zero syllable (øa) were not
diagnostic of a differential absolute 
visibility of consonant phonemes, other 
syllable-pair comparisons permitted an 
analysis of similar order. From the dis- 
tribution of consonant pairs among the 
three major categories of contrastive, 
equivalent, and similar, a high visibility
value can be inferred for the labial and
labialized consonants (Units 1, 2, and 
3 above) since they contrast with all 
the members of all other units. Further, 
except for the zero combinations, no 
nonlabial consonant participated in a 
visually contrastive pair unless the pair 
also included a labial or labialized con- 
sonant. The fact that only labial articu- 
lations are discriminated consistently 
seems to justify the term ‘lipreading’ 
for this level of activity in visual speech 
reception. 

Perhaps the most notable failure 
among the predictions of differential 
visual perceptibility of consonant pho- 
nemes is the refutation of the high 
visibility which the hypothesis assigned 
to the alveopalatal consonants, and their 
participation, consequently, in fewer 
visually contrastive pairs than was pos- 
tulated. Not only are these phonemes 
distinguished from alveolar consonants 
to a much lesser degree than was pre- 
dicted, but also they constitute among 
themselves a virtually homophenous 
unit. It is possible that lip movement 
in palatalization might be more visible 
with a profile view of the speaker, and 
a test of this hypothesis has been de- 
signed for future research. 


Comparison of Experimental and 
Control Data 

Criteria used in assigning specific 
consonant pairs to the visually equiva- 
lent, similar, and contrastive categories 
have been outlined above. Comparable 
techniques were used in processing 
auditory and audiovisual data, except 
that the boundaries were taken directly 
from those established in the analysis 
of the visual data. Because it was antici- 
pated that the auditory and audio- 
visual control groups would achieve 


TABLE 2. Distribution of consonant pairs among
the categories of perceptual difference according
to stimulus condition: visual (1), auditory (2),
audiovisual (3).

Categories              Stimulus Condition
                          1     2     3

Noncontrastive
  Equivalent             24     4     1
  Similar                34    19    16
  Total                  58    23    17
Contrastive              44    79    85
Total Pairs Tested      102   102   102








‘perfect’ scores (except for errors due 
to random distractions in the test situa- 
tion), no specific hypotheses were for- 
mulated in advance of actual testing. 
In view of the assumed validity of the 
analytic units of linguistic methodol- 
ogy, in fact, there would have been 
no logical justification for postulating 
the differential dimensions of percepti- 
bility which were actually obtained. 

Of the 102 possible distinctive pairs 
of consonants represented in the syl- 
lable pairs in the stimulus materials, 44 
proved to be visually contrastive, 79 
aurally contrastive, and 85 pairs came 
within the contrastive range under 
audiovisual conditions of reception. 
These distributions are specified in 
Table 2. These results seem to indicate 
that phonemic distinctions are most 
easily perceived under audiovisual con- 
ditions, that they are slightly less per- 
ceptible from auditory cues alone, and 
that fewer than half of these distinctions 
are perceptible visually. This gross dis- 
tribution might easily have been as- 
sumed, of course, before the phonemes 
were actually subjected to perceptual 
testing, but beyond this, the experiment 
yielded characteristic perceptual pat- 
terns for the three conditions of recep- 
tion. 
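The proportions behind this comparison can be checked directly from the counts in Table 2. The following is a minimal sketch; the dictionary and function names are illustrative assumptions and are not part of the study.

```python
# Counts of contrastive pairs taken from Table 2 (102 pairs tested per condition).
TOTAL_PAIRS = 102
contrastive = {"visual": 44, "auditory": 79, "audiovisual": 85}


def contrastive_share(count, total=TOTAL_PAIRS):
    """Return the fraction of tested pairs that fell in the contrastive range."""
    return count / total


# Hypothetical helper structure: share of contrastive pairs per stimulus condition.
shares = {cond: contrastive_share(n) for cond, n in contrastive.items()}

for cond, share in shares.items():
    print(f"{cond}: {share:.1%} of pairs contrastive")
```

The visual share falls below one half, while the auditory and audiovisual shares are both well above it, which is the ordering described in the text.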













Data for the auditory and audiovisual 
control groups comparable to the visual 
data given in Table 1 have been omitted 
from this report, however, since devia- 
tions from the scores expected may 
have been a function partly of the low- 
fidelity sound reproduction in the test 
film. Although this possibility may 
lessen the validity of generalizations 
about auditory and audiovisual percep- 
tion of phonemes, the results are sug- 
gestive, and they are useful as 
comparative data. It is worth noting, 
further, that under less-than-optimum 
conditions of auditory reception, per- 
ceptual confusions are more likely to 
occur among some phonetic stimuli 
than among others (7). 


Discussion 


Although not many conclusions of 
immediate practical importance to 
speech perceptibility can be drawn 
from a test constructed entirely of non- 
sense syllables, the pilot study of the 
linguistic-model investigation of lip- 
reading processes has provided a valu- 
able control device for future testing 
at higher levels of structure. Differ- 
ential visibility of consonantal articula- 
tions has been measured at a phonetic- 
phonemic level and can now be taken 
into account in examining components 
of the visual perception of larger, 
meaningful utterances. Conversely, later 
experiments utilizing more complex 
stimuli need not be contaminated by 
uncontrolled phonetic data nor com- 
plicated by untestable phonological 
hypotheses. 

With respect to the four visually 
contrastive units derived from the test 
data, it is difficult to conceive of a use- 
ful decoding system based on only four 





Woodward, Barber: Phoneme Perception in Lipreading 221 


units of signal reception when the code 
itself is structured in terms of 24 sig- 
naling units. It is apparent, however, 
that congenitally deaf children do, in 
fact, learn language and that they do 
acquire speech which is based on the 
larger, more flexible system of aurally 
perceptible differences which represent 
English phonology. Although for the 
severely deaf, reception of speech stim- 
uli is practically limited to lipreading, 
the data of this study suggest strongly 
that the perception of speech by such 
individuals (and probably by normal- 
hearing individuals as well) involves 
much more than an ability to make 
judgments—phonetic or visual, con- 
scious or unconscious—about articula- 
tory movements and the visual or 
acoustic results of these. Though lip- 
reading must be the point of transfer 
from the oral to the visual in speech 
reception, it must be in the nature of 
language itself as a meaningful, func- 
tioning system that the determinants of 
speech perception will eventually be 
found, even for the perception of 
phonemic-level distinctions among 
speech components. 


Summary 


A research program has been set up 
to apply the theory and method of 
structural linguistics to an analysis of 
lipreading processes. As the first step, 
perceptual differences among English 
initial consonants were tested. Stimulus 
materials consisted of pairs of phonemi- 
cally identical and minimally different 
nonsense syllables, which provided a 
constant, nonredundant linguistic en- 
vironment for the phonemes tested. 
Stimuli were presented to 185 experi- 
mental subjects, normal-hearing adult 







speakers of English, by means of a 
silent film. The test was administered 
also to smaller control groups by pre- 
senting the sound track alone, and by 
showing the complete film with both 
picture and sound. 


In place of the 24 initial consonants 
tested, results indicate that only four 
visually-contrastive units are available 
consistently to the lipreader. Though 
control group scores were not perfect, 
they were in accord with present 
knowledge about perceptual confusions 
among speech sounds in nonredundant 
environments and under less-than-opti- 
mum conditions of reception. 


Acknowledgment 

The cooperation and assistance of Dr. 
Harry Hoijer, Professor of Anthropol- 
ogy, University of California, Los An- 
geles, and of Dr. Edgar L. Lowell, 
Administrator, John Tracy Clinic, Los 
Angeles, are gratefully acknowledged. 


References 


1. Gleason, H. A., An Introduction to De-
scriptive Linguistics. New York: Holt,
1955.


Listening Training 
for Mentally Retarded Children 


A hearing testing project is planned in
which the subjects will be mentally retarded 
children who do not respond normally to 
pure-tone audiometric sweep testing. For 
those children who do not meet the criteria 
selected to define a significant hearing loss a 
listening training program will be instituted. 
The project thus is designed to add the ele- 
ments of depth, time, and training to the 
hearing testing of a retarded population. A 
selected battery of hearing tests will be re- 
peated at three spaced intervals within the 
continuing listening training program. 


2. Hoijer, H., Linguistic methodology and
its application to lip reading: a prelimi- 
nary report. In Mary F. Woodward, 
Linguistic Methodology in Lip Reading 
Research, No. IV in John Tracy Clinic 
Research Papers. Los Angeles: John 
Tracy Clinic, 1957. 

3. Jakobson, R., Fant, C. G. M., and Halle,
M., Preliminaries to speech analysis, 
Technical Report No. 13. Cambridge: 
MIT Acoust. Lab., 1952. 

4. Mandelbaum, D. G., (Ed.), Selected
Writings of Edward Sapir in Language, 
Culture and Personality. Berkeley and 
Los Angeles: Univ. California Press, 1949. 

5. Martinet, A., Phonology as Functional 
Phonetics; Three Lectures Delivered be- 
fore the University of London in 1946. 
London: Oxford Univ. Press, 1949. 

6. Mason, Marie K., A laboratory method 
of measuring visual hearing ability. Volta 
Rev., 34, 1932, 510-514. 

7. Miller, G. A., and Nicely, Patricia E.,
An analysis of perceptual confusions 
among some English consonants. J. 
acoust. Soc. Amer., 27, 1955, 338-352. 

8. Trager, G. L., and Smith, H. L., Jr., An
outline of English structure. Stud. Ling. 
occ. Pap., 3, 1951. 

9. Whorf, B. L., Linguistics as an exact
science. In Four Articles on Metalinguis-
tics. Washington: Foreign Serv. Inst.,
U.S. Dept. State, 1949.

10. Woodward, Mary F., Linguistic Meth-
odology in Lip Reading Research. No.
IV in John Tracy Clinic Research Papers.
Los Angeles: John Tracy Clinic, 1957.


RESEARCH NEWS NOTE


The expected end product will indicate 
which test or combination of tests offers the 
most valid and reliable means of determining 
the hearing capacity of mentally retarded 
children. The effectiveness of the listening 
program will be evaluated by comparing 
audiometric thresholds. Implications from the 
listening program might be projected into a 
listening program for normal young children. 

This is a cooperative research project with 
the Office of Education, Department of 
Health, Education, and Welfare. 


Bernard B. Schlanger, Ph.D. 
Director, Speech and Hearing Clinic 
West Virginia University, Morgantown 





Sugar Placebos and Stuttering 


JAMES R. PALASEK 


W. SCOTT CURTIS 


Along with the increasing availability 
of pharmaceutical products for the 
treatment of psychological disorders 
comes a need for research into the ap- 
plicability of chemotherapeutic tech- 
niques in certain areas of speech 
correction. A recent article by Burr and 
Mullendore (1) summarized six studies
of the effects of certain tranquilizing 
drugs on stuttering. These studies as 
well as those of Love (5, pp. 248-310) 
and Hale (2) who investigated other 
chemotherapies for stuttering, have in- 
corporated as part of their experimental 
design the use of a placebo. 


The term ‘placebo’ has been defined
from a clinical point of view as ‘an
inactive substance given to satisfy a
patient’s demand for medicine’ (7, sect.
P, p. 61). Such definition seems to be in-
appropriate in the research context used
by the above authors. The following
operational definition is suggested: A
placebo is, for purposes of pharmacolog- 
ical experimentation, a chemical com- 
pound which is identical in appearance 
and method of administration to the 
pharmaceutical agent under investiga-
tion, but which has no known physio-
logical effect on the subject relevant to
the behavior being measured.

James R. Palasek (M.A. Purdue University,
1960) is a Clinical Assistant and W. Scott
Curtis (M.A. Western Michigan University,
1957) is a Research Assistant, Purdue Uni-
versity Speech Clinic.

On the basis of such definition, the 
use of lactose (a sugar form) as a 
placebo in some of the above studies 
may be questioned. This is particularly 
true in light of the findings of Kopp (4) 
and Seeman (6) which suggest a rela- 
tion between stuttering and increased 
blood sugar content. 

The following procedure was used 
to determine whether alterations of 
stuttering, if any, which follow the ad- 
ministration of a so-called placebo, 
differ for sugar (lactose) and nonsugar
(calcium carbonate) compounds. 


Procedure 


Subjects were eight male and one 
female stutterers who were enrolled for 
treatment in the Purdue University 
Speech Clinic. All stutterers demon- 
strated a minimum of 10 stuttering 
blocks on a pretest reading of a 110- 
word passage. Three capsules of iden- 
tical appearance, each containing a 
white chemical compound, were ad- 
ministered. One capsule contained 11 
grains of lactose; a second capsule con- 
tained 11 grains of calcium carbonate; 
a third capsule contained 5.5 grains of 
each. All capsules were prepared by the 












University Health Service with the 
approval of a staff physician. 


Testing was carried out over a seven- 
day period with each subject presenting 
himself for testing on four alternate 
days. Subjects were instructed to ab- 
stain from food for each of the 10-hour 
periods preceding test days. Each day’s 
testing consisted of five readings of the 
standard 133-word passage ‘My Grand- 
father’ with a 10-minute interval be- 
tween readings. The interval allowed 
for the study of effects as the chemical 
was absorbed. 


On the first day all subjects read for 
a control condition with no capsule ad- 
ministered. On each of the three suc- 
ceeding test days, each subject was 
given one of the capsules as randomly 
prescribed, immediately preceding that 
day’s first reading. The capsules were 
administered under double-blind condi- 
tions, with the contents of any one cap- 
sule unknown to either the subjects or 
the experimenters. 

The subjects were instructed that 
they would be required to remain in 
the testing room for approximately one 
hour; that at the beginning of the hour 
they would be given a pill to take; and 
that at certain intervals throughout the 


TABLE 1. Summary of analysis of variance for
number of stuttering blocks in five readings for
a control (no placebo) and three experimental con-
ditions (placebos: lactose and calcium carbonate).

Source            df        ms       F     F.05
Readings (R)       4     35.08     .29     2.67
Conditions (C)     3    437.28    1.87     3.01
Subjects (S)       8   1401.69
RC                12      6.93     .10     1.86
RS                32    119.96
CS                24    234.90
RCS               96     69.94

Total            179








hour they would be asked to read aloud 
a brief passage. They were requested to 
sit quietly in the testing room between 
readings. Immediately preceding the 
first reading, the subject was instructed 
not to use any ‘control techniques’ he 
might have learned to reduce his stut- 
tering. 

Each reading was recorded with the 
subject’s knowledge on a VM 710 re- 
corder in an acoustically treated ther- 
apy room. No listeners were present. 

Stuttering frequency was determined 
from the test recordings by the two 
experimenters listening independently. 
Following the initial judgements, one 
experimenter read aloud the words he 
had marked as stuttered. When the 
second experimenter’s judgement dis- 
agreed with that of the first on a partic- 
ular word, the word was listened to 
again until agreement was reached. The 
experimenters were unaware during the 
judging procedure of the condition in 
which any one reading had taken place. 


Results and Discussion 


The data, which consist of frequency 
counts of the number of words stut- 
tered, were evaluated by means of an 
analysis of variance (3, p. 217) with 
three factors: readings, conditions, and 
subjects. The results, reported in Table 
1, provide no evidence of real differ- 
ences among conditions or among read- 
ings. The interaction between readings 
and conditions is also nonsignificant. 

There is thus no basis for a conclusion
that the placebos had a differential ef-
fect or any effect at all upon frequency
of stuttering.
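The F ratios in Table 1 can be recomputed from the reported mean squares. The sketch below assumes that each effect is tested against its interaction with subjects (R against RS, C against CS, RC against RCS); the article does not state the error terms, but this assumption gives ratios that agree with the tabled values to within rounding. All names in the code are illustrative.

```python
# Mean squares as reported in Table 1.
mean_squares = {
    "R": 35.08, "C": 437.28, "S": 1401.69,
    "RC": 6.93, "RS": 119.96, "CS": 234.90, "RCS": 69.94,
}

# Assumed error terms (not stated in the article): each effect is tested
# against its interaction with subjects.
error_term = {"R": "RS", "C": "CS", "RC": "RCS"}

# Critical F values at the .05 level as listed in Table 1.
critical_f = {"R": 2.67, "C": 3.01, "RC": 1.86}


def f_ratio(effect):
    """Ratio of an effect's mean square to the mean square of its assumed error term."""
    return mean_squares[effect] / mean_squares[error_term[effect]]


for effect in error_term:
    f = f_ratio(effect)
    verdict = "significant" if f >= critical_f[effect] else "not significant"
    print(f"{effect}: F = {f:.2f} ({verdict} at the .05 level)")
```

Under this assumption none of the three ratios reaches its critical value, which matches the conclusion drawn in the text.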

Although obtained differences are 
nonsignificant, there appears to be some 
consistency in the pattern of differences 







TABLE 2. Percentage of reduction of mean stut-
tering blocks from the control condition (no
placebo) for each of the experimental conditions
(placebos: 11 grains lactose, 11 grains calcium
carbonate, or 5.5 grains each) throughout five
readings.

Capsule                        Reading
                          1    2    3    4    5
Lactose                  52   61   52   47   47
Calcium Carbonate        34   36   33   31   35
Lactose and
  Calcium Carbonate      41   31   37   25   44








between the control condition and the 
experimental conditions. Presented in 
Table 2 are percentages indicating re- 
duction of mean frequency of stutter- 
ing between the control condition and 








[Figure 1: plot of mean number of stuttering blocks against READINGS 1 to 5.]


Figure 1. Mean number of stuttering blocks 
for each of five readings in the control (no 
placebo) and experimental (placebos of lac- 
tose and calcium carbonate) conditions: A 
= control condition, B = 11 grains calcium 
carbonate, C = 5.5 grains lactose and 5.5 
grains calcium carbonate, D = 11 grains 
lactose. 


each experimental condition for each 
of the five readings. Frequency of stut- 
tering was reduced for each of the 15 
comparisons. Although no differences 
are significant, it seems possible that this 
consistency should not be explained on 
the basis of chance. It was assumed that 
the spacing between conditions was 
sufficient to eliminate reduction of stut- 
tering attributable to adaptation after 
the control reading. It is impossible,
however, to state with any certainty
that adaptation was not a factor. The
reductions in stuttering frequency 
might also be explained as the result of 
the experimental conditions, that is, the 
use of the three placebos. The largest 
reduction is for the condition in which 
the lactose placebo was used. The find- 
ings of this study do not, of course, 
contraindicate the use of lactose as a 
placebo in pharmacological research 
with stutterers, but they do suggest the 
possibility that further research might 
indicate that the use of nonsugar place- 
bos is to be preferred. 


The mean stuttering frequencies for 
each reading in each condition are 
shown graphically in Figure 1. As al- 
ready stated, neither the differences 
between conditions nor the differences 
between readings are statistically sig- 
nificant. The consistently higher fre- 
quency of stutterings in the control 
condition, discussed above, is, however, 
readily apparent. Also, the expected
adaptation from the first reading
through the fifth reading is indicated.
The most obvious departures are for 
the conditions employing lactose, most 
particularly for the condition employ- 
ing lactose alone. In any case, in view 
of the fairly consistent patterns of dif- 
ferences for readings and for conditions, 







and since these differences are nonsig- 
nificant, it may be said that the indi- 
vidual subjects are extremely variable. 
Also, the variances of effects involving 
subjects (see Table 1) are relatively 
large. Perhaps further experimentation 
with a larger number of subjects would 
yield significant results. 

Further investigations of the placebo 
technique in pharmacodynamics are in- 
dicated: an explanation for the apparent 
alteration of adaptation following the 
administration of lactose, a study of 
the use of true placebos as a test for 
suggestibility, and an investigation of 
the effects of lactose which is admin- 
istered without the knowledge of the 
subjects. 

In more general terms, these other 
research needs are indicated: formula- 
tion of an experimental testing method 
free of the need for placebo type con- 
trols, and consideration of methods for 
the application of placebo technique to 
nonpharmacological studies of stutter- 
ing. 

Basic to the above suggestions is the 
need for an acceptable definition of 
‘placebo’ for research purposes. 


Summary 


The use of lactose placebos in stut- 
tering research was investigated. Nine 
stutterers read a short passage five times 
under a control condition and under 
three experimental conditions after tak- 
ing placebos: (a) 11 grains of lactose,
(b) 11 grains of calcium carbonate,
and (c) a combination of 5.5 grains of 
lactose and 5.5 grains of calcium car- 
bonate. 


The criterion measure was frequency 
of stuttering. By statistical analysis no 
significant differences were found either 
among readings or among conditions. 


Further research with more subjects 
is suggested since the observed differ- 
ences indicate the possibility that stut- 
tering frequency might be decreased 
by the administration of lactose. 


References 


1. Burr, Helen G., and Mullendore, J. M.,
Recent investigations on tranquilizers and
stuttering. J. Speech Hearing Dis., 25,
1960, 33-37.

2. Hale, L. L., A consideration of thiamin
supplement in prevention of stuttering in
preschool children. J. Speech Hearing
Dis., 16, 1951, 327-333.

3. Kempthorne, O., The Design and Analy-
sis of Experiments. New York: Wiley,
1952.

4. Kopp, G. A., Metabolic studies of stutter-
ers: I. Biochemical study of blood com-
position. Speech Monogr., 1, 1934, 117-
132.

5. Love, W. R., The effect of pentobarbital
(nembutal) and amphetamine sulphate
(benzedrine) on the severity of stutter-
ing. In W. Johnson and R. R. Leuten-
egger (Eds.), Stuttering in Children and
Adults, chap. 23. Minneapolis: Univ.
Minnesota Press, 1955.

6. Seeman, M., Contribution of the patho-
genesis of stuttering. Psychol. Abstr., 11,
1937, 399-404.

7. Taber, C. W., Cyclopedic Medical Dic-
tionary. Philadelphia: Davis, 1953.










Cinefluorographic Techniques in Speech Research 


KENNETH L. MOLL 


Two types of radiographic procedures 
have been utilized in the study of the 
speech mechanism (10): single-expo- 
sure techniques and cineradiographic 
techniques. 

Such single-exposure x-ray proce- 
dures as standard lateral-head radiogra- 
phy and laminagraphy have been 
utilized in speech research by various 
investigators (2, 11, 22, 13, 24, 28). Al- 
though these techniques provide pic- 
tures which are sufficiently clear and 
detailed for satisfactory measurement 
of the positions of speech structures, 
they have two basic limitations as 
speech research procedures: (a) pic- 
tures can be taken only during the 
production of isolated, sustained speech 
sounds where the structures are held 
immobile for the duration of the ex- 
posure, and (b) the pictures provide 
only a single sample of articulatory 
positions during the production of 
speech sounds. Since the physiological 
characteristics of speech sounds have 
been shown to vary with phonetic con- 





Kenneth L. Moll (Ph.D., University of 
Iowa, 1960) is Research Associate, Depart- 
ment of Speech Pathology and Audiology, 
and Department of Otolaryngology and Max- 
illofacial Surgery, University of Iowa. This 
article is based on a doctoral dissertation com-
pleted under the direction of Professor James
F. Curtis. This investigation was supported
by research grant M-1158 from the National 
Institute of Mental Health and by research 
grant D-853 from the National Institute of 
Dental Research, Public Health Service. 




text (23, 27), and since acoustic studies 
have demonstrated that sounds are time- 
varying events (6), it appears that 
single-exposure x-ray procedures im- 
pose severe limitations on the general- 
ization of results to physiological 
positions in connected speech. 

Cinefluorography with electronic im- 
age intensification, in which motion 
pictures are taken from a fluoroscopic 
screen,* appears to be a promising tech-
nique for speech research. Since it does
not require the static conditions neces- 
sary in single-exposure procedures, it 
can utilize connected speech samples 
rather than only sustained sounds, and 
it will provide more than one cross- 
sectional time sample during speech 
sounds. Thus movements, as well as 
static, maintained positions, may be 
studied. 

If cinefluorographic techniques are 
to be utilized effectively for research 
studies of speech production, it appears 
necessary that information contained in 
the film be reliably extracted and ex- 
pressed in a quantitative form so that 
statistical comparisons can be made. 

Although a number of cinefluoro- 
graphic studies of the speech mecha- 
nism have been reported (4, 7, 15, 17, 
21), in none of these studies were at- 


*An extensive bibliography of literature 
dealing with the principles and the develop- 
ment of cinefluorography can be found in 
reference 22, pages 244-262. 









tempts made to express structural posi- 
tions and movements in quantitative 
form. Other investigators (1, 3, 8, 9), 
utilizing frame-by-frame tracing pro- 
cedures to analyze cinefluorographic 
film of activities other than speech, also 
did not quantify the information from 
the films. 

The present study was designed to 
investigate the methodological prob- 
lems involved in using cinefluoro- 
graphic techniques for studies of the 
physiological characteristics of speech 
articulation. More specifically, the pur- 
poses of this study were to determine 
procedures for obtaining clear cine- 
fluorographic pictures of the speech 
articulatory structures with tolerable 
x-ray dosages, to investigate the relia- 
bility of procedures for quantitatively 
extracting cinefluorographic informa- 
tion, and to carry out a demonstration 
study using the techniques developed. 


Cinefluorographic Equipment 


The cinefluorographic equipment 
utilized in the present study (Figure 1) 
was designed by the North American
Philips Company. The primary com-
ponents of the assembly are A, a Rota-
lix, 125 kvp x-ray tube with a line focus
of 0.3 mm; B, a Philips five-inch image
intensifier tube with an intensification
factor of approximately 1000; C, a
fluoroscopic viewer; and D, an Auri-
con, 16 mm camera.


Power is supplied to the x-ray tube 
by a Monoray 100 kvp generator. The 
generator, when activated, provides 
constant radiation; operation is not 
intermittent and synchronized with the 
camera shutter as in some cinefluoro- 
graphic equipment. 

Descriptions of image intensifier 
tubes are available in the literature 
(5, 20); however, one aspect of the 
Philips five-inch tube should be men- 
tioned since it will be referred to later. 
The current is measured by the photo- 
cathode and indicated on a direct-cur- 
rent millivolt meter. If the image to be 
photographed covers the entire fluo- 
rescent screen, as it usually does with 
the five-inch tube, the millivolt reading
is a measure of the brightness of the
fluoroscopic image. This reading can
thus be used as an index of film ex-
posure.














Figure 1. Major components of the cinefluorographic equipment: A, x-ray tube; B, image
intensifier tube; C, fluoroscopic viewer; D, Auricon camera.












The Auricon camera provides a film 
speed of 24 frames per second with a 
shutter opening of 1/48 sec. An f/1.0 
lens with a one-inch focal length is used 
on the camera. An optical, unilateral, 
variable-area sound track is recorded 
directly on the film. 
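The timing implied by these camera settings can be worked out in a few lines. This is a sketch only: the variable names are illustrative, and the helper that converts a frame count to elapsed time is an assumption added here, not a procedure described in the article.

```python
from fractions import Fraction

FRAME_RATE = 24                  # frames per second (from the text)
SHUTTER_OPEN = Fraction(1, 48)   # seconds the shutter is open per frame (from the text)

# Time from the start of one frame to the start of the next.
frame_interval = Fraction(1, FRAME_RATE)

# Share of each frame interval during which the film is actually exposed.
exposed_fraction = SHUTTER_OPEN / frame_interval


def frames_to_seconds(n_frames):
    """Elapsed time spanned by n_frames consecutive frame intervals (hypothetical helper)."""
    return n_frames * frame_interval


print(f"frame interval: {float(frame_interval) * 1000:.1f} ms")
print(f"exposure per frame: {float(SHUTTER_OPEN) * 1000:.1f} ms")
print(f"fraction of each interval exposed: {float(exposed_fraction):.2f}")
```

With these settings each frame records about half of its frame interval, so any articulatory movement that occurs while the shutter is closed does not appear on the film.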

The subject is seated in a dental 
chair. A head positioner, shown in Fig- 
ure 1, consisting of ear rods and a plas- 
tic forehead bumper, is suspended from 
the ceiling above the chair. The posi- 
tioner is normally fixed so that the mid- 
sagittal plane of the subject will be at 
right angles to the central ray of the 
x-ray beam. Exact positioning, in order 
to obtain a view of the particular an- 
atomical area desired, is accomplished 
by observing the fluoroscopic image in 
the viewer. 


Cinefluorographic Procedures 

The effects of x-ray generator set- 
tings, x-ray filtration, film stock, and 
film processing on the definition of the
articulatory structures were evaluated 
by single-frame projection of the cine- 
fluorographic pictures. Not all possible 
variations and combinations of factors 
were studied; rather, various types of 
film and types of x-ray filtration rec- 
ommended by other investigators were 
utilized. 

Generator Settings. The typical 
X-ray generator settings which were 
found to result in clear cinefluoro- 
graphic pictures were approximately 76 
kv and 5.0 ma for adult subjects and 
76 kv and 2.0 ma for children. The 
constant factor used to determine the 
settings for a particular subject was the 
exposure reading obtained from the 
millivolt meter which was described 
previously. Adjustments of the gener-
ator settings to achieve the desired ex-
posure reading are made while the
subject is talking since the reading ob- 
tained will vary as the movements of 
the structures change the density of 
the field. Required exposure readings 
vary with the type of film and the type 
of film processing used. 

Radiation. Radiation measurements 
were made directly in the x-ray beam 
at the position usually occupied by the 
midline of a subject’s head (28 in. from 
the x-ray tube). At the typical gener- 
ator settings for adult subjects (76 kv 
and 5.0 ma) the measured radiation, 
when corrected for tissue density and 
backscatter, was 1.30 roentgens per 
minute. Since no subject is exposed for 
more than three minutes, the typical 
radiation dose of 3.9 r is well below the 
accepted maximum dose for an exam- 
ination of 20 to 25 r (18). The radia- 
tion dosage to the subject’s gonadal 
region was not measurable. The radia- 
tion to the operator was found to be 
well within safety limits. 
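The dose figure quoted above follows from multiplying the measured rate by the longest exposure time. The sketch below repeats that arithmetic; the function and constant names are illustrative assumptions, and the lower bound of the cited 20 to 25 r range is used as the comparison limit.

```python
MAX_EXPOSURE_MIN = 3.0       # longest exposure for any subject, in minutes (from the text)
ACCEPTED_MAX_DOSE_R = 20.0   # lower bound of the 20 to 25 r limit cited in the text


def total_dose(rate_r_per_min, minutes=MAX_EXPOSURE_MIN):
    """Total dose in roentgens for a constant dose rate over the given exposure time."""
    return rate_r_per_min * minutes


# Typical adult settings (76 kv, 5.0 ma): 1.30 r per minute.
typical = total_dose(1.30)
print(f"typical dose: {typical:.1f} r")
print(f"below accepted maximum: {typical < ACCEPTED_MAX_DOSE_R}")
```

The same function can be applied to other dose rates, such as the approximately 2.50 r per min that the article later gives for Shellburst film, which for a three-minute exposure also remains below the cited limit.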

Film Stock and Film Processing. 
Four types of film were investigated 
in this study: Kodak Tri-X negative, 
Kodak Cineflure, Kodak Linagraph 
Shellburst, and DuPont 931A reversal. 
These film types had been used for 
cinefluorography by previous investi- 
gators (1, 16, 29). The clarity of the 
articulatory structures on each of the 
four films was evaluated by single- 
frame viewing. 

Of the four film types used, Kodak 
Linagraph Shellburst appears to give 
the best definition of the speech ar- 
ticulatory structures. The primary 
limitation of Shellburst is its slow emul- 
sion speed; however, the radiation re- 
quired to adequately expose this film 
(approximately 2.50 r per min) is still 













Figure 2. Equipment used for tracing cine-
fluorographic frames.

relatively low. If a faster film is desired, 
DuPont 931A reversal appears to be 
the best alternative. Tri-X and Cine- 
flure, although providing high emulsion 
speeds, appear to be too ‘grainy’ for 
research purposes. 

The negative film was processed in 
Eastman D-76 developer at a tempera- 
ture of 70.7°F with developing times 
varying from 6.7 to 16.0 min. Increas- 
ing developing times beyond 10 min 
resulted in a prohibitive increase in 
graininess on Tri-X and Cineflure. For 
Shellburst film developed for the maxi- 
mum time of 16.0 min, the increase in 
graininess was relatively slight and con- 
trast was markedly increased. Utiliza- 
tion of special x-ray developer at 
higher temperatures, as suggested by 
other investigators (19, 29), may prove
of value with Shellburst film. 

X-Ray Filtration. Two types of fil- 
ters (2.0 mm of aluminum and 0.25 mm 
of copper) were used to filter the x-ray 
beam, and the effects of each type on 
image quality were evaluated. The 
aluminum filter was found to produce 
the best results. Although it is possible 
to reduce the radiation by approxi- 
mately 25% by using the copper filter, 
a significant loss in soft-tissue definition 
occurs. 

Radiopaque Marking. Marking the 
soft tissue structures with a radiopaque
substance is not usually necessary to
obtain good definition; however, mark- 
ing the tongue at the midline with a 
thin line of Rugar aids visualization 
during single-frame viewing. Such 
marking also permits identification 
of the midline when the tongue is 
grooved. Little spreading of the Rugar 
occurs during the first few minutes 
after application, during which time 
cinefluorographic pictures can be taken. 


Analyzing Procedures 


To analyze movements of the articu- 
latory structures, sequences of individ- 
ual cinefluorographic frames were 
traced and various measurements were 
made from the tracings. 


Tracing Apparatus. The equipment 
used for tracing cinefluorographic 
frames (Figure 2) consists of a Bell and 
Howell ‘Time and Motion Study Pro- 
jector’ and a tracing surface of quarter- 
inch plexiglass mounted in a steel 
frame. Both the projector and tracing
frame are mounted on slides so that the
distance between them can be varied to
adjust the enlargement of the pro- 
jected image. Tracing paper (Clear- 
print 1000H) is clamped to the tracing 
surface, the image is projected through 
the plexiglass onto the paper, and the 
tracing is made. Using cinefluorograph- 
ic pictures of a wire grid, it was de- 
termined that the cinefluorographic and 
tracing equipment do not produce 
measurable distortion of the projected 
image. 

Frame Identification. By visual in- 
spection of the variable-area sound 
track it is possible, for most sound se- 
quences, to define the limits of the 
individual speech sounds within reason- 
able limits of error, a technique similar 







Moll: Cinefluorography in Speech Research 231 


to that used previously in a motion 
picture study by Smith (26). Follow- 
ing determination of the limits of the 
individual speech sounds on the sound 
track, the cinefluorographic frames as- 
sociated with these sounds may be lo- 
cated; those frames exposed during the 
preliminary or ending articulatory 
movements of the speech production 
can be identified by viewing the film 
in slow motion. The frames thus identi- 
fied as making up the total speech pro- 
duction are numbered with a grease 
pencil so that they can be individually 
identified when projected. 

Enlargement Adjustment. The de- 
sired enlargement of the projected 
image for tracing is determined in rela- 
tion to a constant distance between 
selected structures (tip of the upper 
central incisor teeth and the posterior 
pharyngeal wall at rest) rather than to 
a constant enlargement factor for all 
subjects. This procedure is utilized so 
that a standard grid can be superim- 
posed on the tracings of different sub- 
jects in order to provide reference lines 
for measurements.*

To determine the enlargement factor 
for each subject, a cinefluorographic 
sequence is taken of a lead plate which 
is notched at one centimeter intervals 
along one edge and which is placed 
vertically between the subject’s ante- 
rior teeth at the midline. By measuring 
the distances between the notches on 
the projected image the exact degree 
of enlargement can be determined. 
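
The enlargement computation described above can be sketched in Python. This is an illustration only: the function name `enlargement_factor` and the sample readings are hypothetical, and averaging over several notch intervals is an assumed way of reducing reading error, not a step reported in the study.

```python
def enlargement_factor(projected_mm, actual_mm=10.0):
    """Return the mean ratio of projected to actual notch spacing.

    projected_mm: distances (mm) measured between adjacent notches on
                  the projected image (hypothetical readings).
    actual_mm:    true notch spacing; the plate is notched at
                  one-centimeter (10 mm) intervals.
    """
    if not projected_mm:
        raise ValueError("at least one notch interval is required")
    ratios = [d / actual_mm for d in projected_mm]
    return sum(ratios) / len(ratios)


# Hypothetical readings of three adjacent notch intervals, in mm.
factor = enlargement_factor([15.0, 15.25, 14.75])
print(factor)
```

A measurement taken on the projected image would then be divided by this factor to express it in 'life size' terms.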

Tracing and Measuring Procedures.
After the enlargement of the projected
image has been adjusted, tracings are
made of the cinefluorographic frames
previously identified as making up the
speech movement to be studied. A line
drawing showing the structures traced
in the present study appears in Figure 3.

*A complete description of the enlargement
adjustment procedure and the use of the grid
may be found in the author’s dissertation,
The Use of Cinefluorography in Speech Research:
a Methodological Study, University of Iowa, 1960.

Figure 3. Line drawing of a cinefluorographic
frame showing the four measurements made
in the present study: P-P’, extent of velopharyngeal
contact; V-P, velum-pharynx distance; T-A,
tongue-alveolus distance; IO, incisal opening.
The measures T-A and V-P represent minimum
distances between structures.

Examples of measures which can be 
obtained from tracings of individual 
frames are presented in the following 
section on the reliability of the analyz- 
ing procedures. Measurements are made 
to the nearest .25 mm using dividers 
and a Paragon 1472P rule which is 
graduated in .50 mm divisions. If de- 
sired, measurements can be corrected 
to ‘life size’ by use of the enlargement
factor determined for each subject. 
Measurements between structures can 
be made without reference lines; how- 
ever, reference lines can be made avail- 
able by superimposition of a grid on 
the tracings. 

An alternative to the tracing-measur- 
ing procedure is to make the measure- 
ments directly from the projected 
image, without tracings, which has the 
advantage of being less time consum- 





232 Journal of Speech and Hearing Research 


ing; however, tracings then are not 
available for further measurement. 
Reliability of the Analyzing Proce- 
dures. To evaluate the accuracy with 
which measures can be derived from 
cinefluorographic frames by the pro- 
cedures described above, a reliability 
analysis was carried out using selected 
sequences from films on one subject. 

Four measures, illustrated in Figure 3, 

were used in the reliability analysis: 

(a) tongue-alveolus distance (T-A), 
closest distance between the tongue 
and the alveolus; 

(b) incisal opening (IO), distance be- 
tween biting edges of upper and 
lower central incisor teeth; 

(c) velum-pharynx distance (V-P), 
the closest distance between the 
posterior surface of the velum and 
the posterior pharyngeal wall; 

(d) extent of velopharyngeal contact 
(P-P’), the straight-line distance 
between the inferior and superior 
points of contact of the velum and 
the posterior pharyngeal wall. 


Table 1. Agreement between measurements
made on repeated tracings of cinefluorographic
frames.

Measure                             Number of   Pearson       Mean Discrepancy
                                    Frames      Correlation   (in mm)
Tongue-Alveolus Distance (T-A)      123         .99           0.54
Incisal Opening (IO)                126         .95           0.44
Velum-Pharynx Distance (V-P)        62          .90           0.66
Extent of Velopharyngeal
Contact (P-P’)
    Subject J                       64          .87           1.49
    Subject R                       50          .60           0.99

Four film sequences, consisting of
126 frames, were independently traced
and the tracings measured on two
separate occasions. In the reliability 
analysis, all frames on which both 
measurements were zero were elimi- 
nated. Pearson correlation coefficients 
for all four measures (Table 1) are 
quite high, indicating good relative 
agreement of the repeated measure- 
ments. Mean discrepancies are ap- 
proximately .50 mm for three of the 
measures. For the measure of extent of 
velopharyngeal contact the mean dis- 
crepancy is much larger (1.49 mm). 
Since this relatively large error may 
have been due to lack of structural 
definition on the film of this particular 
subject (Subject J), the reliability 
analysis for this measure was repeated 
on film from a second subject (Subject 
R). Although for the second subject 
the mean discrepancy was decreased to 
0.99 mm (Table 1), the correlation co- 
efficient also decreased from .87 to .60. 
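
The two agreement statistics reported in Table 1 can be illustrated with a short Python sketch. It assumes that 'mean discrepancy' means the mean absolute difference between paired measurements, which the text does not state explicitly. The function name `reliability` and the sample readings are hypothetical and are not data from the study.

```python
import math


def reliability(first, second):
    """Return (frames used, Pearson r, mean absolute discrepancy).

    Frames on which both measurements are zero are dropped first,
    as in the reliability analysis described in the text.
    """
    pairs = [(a, b) for a, b in zip(first, second) if not (a == 0 and b == 0)]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two usable frames")
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    r = sxy / math.sqrt(sxx * syy)
    # Assumed definition: mean of the absolute paired differences.
    mean_discrepancy = sum(abs(x - y) for x, y in pairs) / n
    return n, r, mean_discrepancy


# Hypothetical T-A readings (mm) from two tracings of the same frames.
first_tracing = [0.0, 2.5, 4.0, 6.25, 8.0, 0.0]
second_tracing = [0.0, 3.0, 4.5, 6.0, 8.5, 0.5]
n_frames, r, discrepancy = reliability(first_tracing, second_tracing)
print(n_frames, round(r, 3), discrepancy)
```

The two statistics answer different questions, which is why the text reports both: a correlation can stay high while every reading is offset by a constant amount, and the mean discrepancy can fall while the correlation also falls, as happened for Subject R.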

To determine the amount of meas- 
urement error, repeated measurements 
were made from the same tracings. The 
error involved in measurement was 
found to be relatively small, the largest 
discrepancy being 0.20 mm for the 
measure of extent of velopharyngeal 
contact. 

The accuracy of tracings was evalu- 
ated by measuring discrepancies be- 
tween outlines of selected structures 
(tongue, maxilla, velum, and posterior 
pharyngeal wall) on superimposed trac- 
ings of the same cinefluorographic 
frames. The measurements were made 
at three constant, selected points on 
each structure. The average error in- 
volved in tracing these structures ap- 
pears to be approximately .50 mm. 

Table 2. Agreement between repeated measurements
made directly from projections of
cinefluorographic frames.

Measure                             Number of   Pearson       Mean Discrepancy
                                    Frames      Correlation   (in mm)
Tongue-Alveolus Distance (T-A)      59          .98           0.69
Incisal Opening (IO)                61          .96           0.60
Velum-Pharynx Distance (V-P)        34          .82           0.87
Extent of Velopharyngeal
Contact (P-P’)                      27          .48           1.54

To evaluate the reliability of measurements
made directly from the projected
images, repeated measurements
were made on projections of the same 
frames. For the measures T-A and IO, 
the reliability data with this analyzing 
technique (Table 2) are similar to those 
obtained for the tracing-measuring pro- 
cedure. For the other two measures, 
however, the correlation coefficients 
are lower and the mean discrepancies 
are greater for measurements made di- 
rectly from projections than for those 
made from tracings. 


Evaluation of Analyzing Procedures. 
The procedures described above pro- 
vide the quantification of structural 
positions which would seem to be re- 
quired for systematic studies of speech 
movements. The accuracy with which 
such quantification can be achieved 
seems adequate for most of the meas- 
ures used. The importance of an aver- 
age error of .50 mm must, of course, 
be evaluated in relation to each particu- 
lar research purpose. 

The lower reliability of the measure 
of extent of velopharyngeal contact is 
probably due to the fact that both the 
velum and posterior pharyngeal wall 


are soft tissue structures. As a result, 
their outlines tend to ‘fuse,’ making it 
difficult to determine the exact points 
at which they make contact. 

It seems apparent that the accuracy 
of the analyzing procedures depends 
primarily on the definition of structures 
provided by the film being analyzed. 
The reliability data reported above re- 
late only to the film utilized, or to films 
of similar clarity. The results of this 
study demonstrate, however, that re- 
liable measures can be obtained with 
these analyzing procedures. 

The primary disadvantage of the 
tracing-measuring procedure is that it 
is time consuming. It is possible to trace 
approximately 40 frames per hour and, 
using the four measures previously 
described, to measure 35 frames per 
hour. Using the procedure of making 
measurements directly from the pro- 
jected image, it is possible to analyze 
completely approximately 35 frames per 
hour; however, this procedure may 
result in less reliability for some meas- 
ures. Although the time required 
imposes practical limitations on the 
number of subjects and the size of the 
speech sample which can be studied, 
there appear to be no alternative meth- 
ods now available for analyzing cine- 
fluorographic film. 
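
The practical cost implied by these rates can be estimated with a small Python sketch. It assumes that tracing and measuring are separate, sequential steps whose times add, which is one reading of the figures above. The function name `analysis_hours` is hypothetical, and the 126-frame example simply reuses the size of the reliability sample.

```python
def analysis_hours(n_frames, trace_rate=40.0, measure_rate=35.0, direct_rate=35.0):
    """Estimate analysis time in hours from the approximate rates in the text.

    Rates are in frames per hour. Returns a pair:
    (tracing followed by measuring, direct measurement from projection).
    """
    if n_frames < 0:
        raise ValueError("n_frames must be non-negative")
    tracing_then_measuring = n_frames / trace_rate + n_frames / measure_rate
    direct = n_frames / direct_rate
    return tracing_then_measuring, direct


traced_hours, direct_hours = analysis_hours(126)
print(traced_hours, direct_hours)
```

Under these assumptions the tracing-measuring procedure takes somewhat less than twice as long as direct measurement for the same number of frames.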

Demonstration Study 

The purposes of the pilot study 
were: (a) to demonstrate analytical 
procedures, methods of controlling 
pertinent variables, and methods of 
presenting cinefluorographic data, and 
(b) to obtain preliminary data on the 
physiological characteristics of selected 
vowel sounds under various conditions 
of speech production. 

Method. Subjects were two young
adult females, neither of whom exhibited
marked dental deviations or
abnormal speech patterns. Cinefluoro- 
graphic pictures of the articulatory 
structures were taken while the subjects 
performed the following articulatory 
tasks: 


(a) production of the vowel sounds 
/i/ and /a/, each sustained for ap- 
proximately two seconds; 

(b) production of series of five /i/ and 
five /a/ vowels at a rate of one 
sound per second with a break in 
phonation between successive
sounds in the series; 

(c) production of six disyllables, /iti/, 
/isi/, /ini/, /ata/, /asa/, and 
/ana/, at a syllable rate approxi- 
mating that used in connected 
speech. 

The subjects were instructed to use 
the pitch level that seemed natural for 
them on all phonations. On the series 
of vowels and on the disyllables the 
subjects were instructed to put equal 
stress on each component of the utter- 
ance. The subjects monitored the in- 
tensity of each phonation by means of 
a VU meter driven from a carbon 
throat microphone. The throat micro- 
phone was used to minimize interference 
from the noise of the cinefluorographic 
equipment. No attempt was made to 
equate the intensities of phonations in- 
volving different vowels since the 
vowels will normally differ in intensity. 
The subjects practiced the phonations 
until the investigator judged that the 
instructions as to pitch, stress, and in- 
tensity were being adequately followed. 

X-ray generator settings of 76 kv and 

5.0 ma were used for both subjects. 
DuPont 931A reversal film, developed 
by normal reversal processing, was 


used. Each subject received two min- 
utes of radiation so that the total dose to 
each subject was approximately 2.50 r. 

The frame-by-frame tracing-measur- 
ing procedures previously described 
were utilized for analyzing the cine- 
fluorographic film. All of the phona- 
tions, except the last three sounds in the 
series of five vowels, were analyzed. 
Tracings similar to that shown in 
Figure 3 were made from each frame. 
On some frames the tongue was 
grooved so that the midline, which was 
marked with Rugar, was inferior to the 
lateral edges. In these instances, meas- 
urements were made to the midline 
of the tongue. The four measurements 
previously described were made on 
each tracing. 

In addition to the cinefluorographic 
procedures, spectrograms were made on 
the Kay Sonograph from tape record- 
ings of selected portions of the film 
sound tracks. From the spectrograms 
the frequency of the second vowel for- 
mant was measured at intervals of .04 


sec, the first measurement being made


at the point at which phonation began. 
This time interval was selected because 
it approximated the rate of frame ex- 
posure. 
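
How closely the .04 sec interval approximates the frame rate can be checked with a short Python sketch. The constants come from the text (24 frames per second and .04 sec between formant measurements). The function names, and the pairing of the k-th frame with the k-th formant measurement, are assumptions made for illustration.

```python
FRAME_RATE = 24.0      # frames per second, as stated in the text
SPECTRO_STEP = 0.04    # seconds between formant measurements, as stated


def frame_time(frame_index):
    """Time (s) of a frame, counting the first frame of phonation as 0."""
    return frame_index / FRAME_RATE


def formant_sample_time(sample_index):
    """Time (s) of a formant measurement, counting the first as 0."""
    return sample_index * SPECTRO_STEP


def alignment_error(index):
    """Time offset (s) between the k-th frame and the k-th formant measurement."""
    return frame_time(index) - formant_sample_time(index)


# Offset accumulated after one second of film (24 frames).
print(alignment_error(24))
```

Because one frame lasts 1/24 sec (about .0417 sec), the two series drift apart by one full measurement interval per second of speech. This is negligible for the short disyllables studied but would matter for longer samples.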


Results. The basic data derived from 
the cinefluorographic films were rep- 
resented graphically as shown in the 
sample graphs in Figures 4 and 5. The 
cinefluorographic frame numbers are 
given on the abscissa of each graph. 
The abscissa thus represents the time 
dimension since the films were exposed 
at a constant speed of 24 frames per 
second. The ordinates show the magnitudes
of the four structural measurements.
Since the primary interest was in
relative, rather than absolute, data comparisons,
the measurements were not
corrected for projection enlargement.
The scale for each measure is arranged
so that an increase in the proximity or
in the extent of contact of the structures
is reflected by a point higher on
the ordinate. At the top of each graph
the duration of sound track modulation
and the limits of each phoneme within
the speech production are indicated.

Figure 4. Measurements made from cinefluorographic
frames taken during the production
of the vowel /i/, sustained and also in a
series of five short phonations (one per
second). The abscissa represents the time
dimension, the ordinate, magnitude of structural
measurements.
[Graphs for Subject J not reproduced. Abscissa: Frame No. (time); sound track modulation is shown at the top of each graph.]

General Observations. Although the 
results of this study are limited to two 
subjects, a number of observations con- 
cerning the physiological characteristics 


of speech production can be made. 

It appears that the articulatory struc- 
tures seldom, if ever, assume static 
positions, even during the productions 
of sustained vowel sounds (Figure 4). 
Although some of the apparent varia- 
tion in measurements may be attributed 
to error introduced by the analyzing 
procedures, it is unlikely that variations 
of the magnitude observed (often great- 
er than 2 mm) are due entirely to 
error. 

Another observation which can be 
made from the data in Figures 4 and 5 
is that the closure of the velopharyngeal 
port (measure V-P) always precedes 
the beginning of phonation by a num- 
ber of cinefluorographic frames. Clo- 
sure also appears to precede the 
movement of the tongue into position 
for the vowel phonation. For example,
on the first /i/ sound in the vowel
series shown in Figure 4, closure occurs
six frames before the vowel phonation
begins. When closure occurs the tongue
is still approximately 4 mm from the
average position it will assume during
the vowel sound. It can also be noted
from all of the graphs that velopharyngeal
contact is broken simultaneously
with, or slightly before, the end of
phonation.

Figure 5. Measurements made from cinefluorographic
frames taken during the production
of various disyllables by two subjects,
[Graphs not reproduced. Each graph shows the sound track and the structural measurements plotted against frame number.]

From the measurements of extent of 
velopharyngeal contact (P-P’) it ap- 
pears that when the velopharyngeal 
aperture is closed, the velum oscillates 
against the posterior pharyngeal wall. 
This oscillation is quite noticeable also 
when the cinefluorographic pictures are 
viewed in moving projection. 

The data also suggest that the meas- 
ures of tongue-alveolus distance (T-A) 
and incisal opening (IO) are closely 
related. The tongue and the lower jaw 
appear to move somewhat synchro- 
nously. It seems probable that much of 
this aspect of tongue movement can be 
accounted for by movement of the 
lower jaw. 

Consonant Sounds. The articulatory 
positions for the production of the con- 
sonant sounds studied are of short dura- 
tion. On most of the productions of 
the /t/ sound, tongue-alveolus contact 
is maintained for only one or two 
frames. Due to the short duration of 
such contact, it is possible to miss the 
contact completely, as illustrated by the 
data plotted in Figure 5 for the disyl- 
lable /ata/ of Subject R. Since a con- 
tact is undoubtedly necessary to 
produce the /t/ sound, it is probable 
that the contact in this syllable was 
extremely short and occurred during 
the closed phase of the camera shutter, 


that is, between frames of the film. 
Tongue-alveolus contact was missed 
also on one of the syllables involving 
the /n/ sound. 

For none of the /s/ sounds studied 
was there tongue-alveolus contact at 
the midline, although the distance be- 
tween the two structures decreased to 
1 or 2 mm (Figure 5). It was noted 
from the projected pictures, however, 
that there was contact of the tongue 
and alveolus laterally. Such a tongue 
position is consistent with classical de- 
scriptions of the production of the /s/. 


Vowel Sounds. Although, as men- 
tioned previously, there is great varia- 
tion of the measures during vowel 
production, there seem to be certain 
average positions of the tongue and jaw 
which are characteristic for each vowel. 
For example, on the vowel /i/, shown in 
Figure 4, the average tongue-alveolus 
distance is approximately 3.5 mm and 
the usual incisal opening is 4.0 mm. 
These average positions seem to apply 
to both the sustained vowel sounds and 
to the vowels in a series phonated by 
this subject. Except for the post-con- 
sonant /a/ vowels, the average positions 
also apply to the vowels in the disyl- 
lables. For the post-consonant /a/, 
however, the tongue was observed to 
be closer to the alveolar ridge and the 
incisal opening to be less than for the 
other /a/ vowels. This is illustrated 
by the data in Figure 5 for the disyl- 
lable /ata/. It was also noted that for 
the preconsonant vowels, the tongue 
appears to glide through the positions 
which are characteristic for the sus- 
tained vowel production. In most of the 
disyllables (Figure 5) there seems to be 
no steady-state portion of the initial 
vowel. 







Table 3. Mean extents of velopharyngeal contact for the vowels /i/ and /a/ produced in various
phonation conditions by two subjects. Measurements are in millimeters and are corrected for projection
enlargement so as to represent ‘life size’ for each subject.

Vowel Phonation          Vowel /i/                  Vowel /a/
Condition            Subject J   Subject R      Subject J   Subject R
Sustained              11.55       12.53           9.42        0.00
In Vowel Series         7.87        7.86           6.53        3.01
Preceding /t/          10.10        8.20           9.05        9.11
Preceding /s/           9.96        9.33           8.63        8.68
Preceding /n/           9.05        3.20           8.55        0.00








On most of the vowel phonations 
complete closure of the velopharyngeal 
port was observed. Subject R, however, 
never achieved closure on the sustained 
/a/ and seldom obtained it during 
phonation of the /a/ in a series. In 
disyllables involving the nasal con- 
sonant /n/ (Figure 5), there is usually 
contact of the velum and posterior wall 
for the vowel preceding the consonant; 
however, there is no velopharyngeal 
closure for the final vowel. This ob- 
servation was consistent for all of the 
/n/ disyllables studied. This finding 
may be a result of the fact that in the 
disyllables the final vowel is probably 
in the same syllable as the nasal con- 
sonant while the initial vowel is in the 
preceding syllable. 

In an attempt to study the relation- 
ships between velopharyngeal closure 
and vowel production and between 
closure and vowel phonation condition, 
the measures of extent of velo- 
pharyngeal contact (P-P’) were av- 
eraged for each vowel phonation. Only 
measurements for frames which oc- 
curred during the actual vowel phona- 
tion were averaged. In the disyllables, 
only measures for the preconsonant 
vowel were used, since there was no 


velopharyngeal contact for the post- 
consonant vowels in the /n/ syllables. 
The average measurements for each 
vowel phonation were corrected for 
projection enlargement so as to repre- 
sent ‘life size’ for each subject and are 
presented in Table 3. 
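
The averaging and correction steps just described can be sketched in Python. The sketch assumes that the correction for projection enlargement is a simple division by the subject's enlargement factor. The function name `mean_life_size_contact`, the frame numbers, and the readings are hypothetical and are not data from the study.

```python
def mean_life_size_contact(pp_projected_mm, phonation_frames, enlargement):
    """Mean P-P' over the frames of a vowel phonation, in life-size mm.

    pp_projected_mm:  mapping of frame number to P-P' measured on the
                      projected image (mm).
    phonation_frames: frame numbers falling within the vowel phonation.
    enlargement:      projection enlargement factor for the subject
                      (assumed to act as a simple linear scale).
    """
    if enlargement <= 0:
        raise ValueError("enlargement factor must be positive")
    values = [pp_projected_mm[f] for f in phonation_frames]
    if not values:
        raise ValueError("no frames fall within the vowel phonation")
    return (sum(values) / len(values)) / enlargement


# Hypothetical projected P-P' readings (mm) keyed by frame number.
readings = {10: 0.0, 11: 12.0, 12: 15.0, 13: 18.0, 14: 15.0, 15: 0.0}
mean_contact = mean_life_size_contact(readings, range(11, 15), enlargement=1.5)
print(mean_contact)
```

Frames outside the phonation (here frames 10 and 15) are excluded before averaging, so periods of no contact before and after the vowel do not lower the mean.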


There appears to be a definite varia- 
tion in average contact on vowel sounds 
as a function of the speech condition. 
Except for the /a/ sound for Subject 
R, there is greater contact for the sus- 
tained vowels than for the vowels in 
other conditions. For the vowels in 
consonant contexts, those associated 
with the /t/ and /s/ exhibit slightly 
more contact than those in the /n/ 
syllables, although the differences are 
of relatively small magnitude in some 
instances. A consistent difference be- 
tween the extent of velopharyngeal 
contact can also be noted between the 
two vowels studied. The contact for 
/a/ is less than that for /i/ in almost all 
conditions. This finding is consistent 
with those of previous investigators 
(14, 30). It was also noted that, although
Subject R achieved no velopharyngeal
closure for the sustained /a/ or /a/
associated with the /n/ sound, she
achieved closure when this vowel was
phonated in a vowel series and when it
was associated with the consonants /t/
and /s/ in syllables.

Figure 6. Relationship of tongue-alveolus
distance and second formant frequency on
/a/ disyllables produced by one subject.
[Graph not reproduced. Tongue-alveolus distance (T-A, in mm) and second formant frequency (in cps) are plotted against Frame No. (time), with the sound track shown at the top.]

Tongue Movement and Formant Fre- 
quency Change. In Figure 6 the 
tongue-alveolus distance and the fre- 
quency of the second vowel formant 
are plotted for the /a/ disyllables of 
Subject R. Similar data plots were made 
for the /i/ disyllables of this subject 
but are not shown here. The time scales 
of the two measures are aligned so that 
the first cinefluorographic frame of the 
initial vowel coincides with the begin- 
ning of the phonation as determined 
from the spectrogram. 

The direction of the formant fre- 
quency changes during the consonant- 
vowel transitions is consistent with the 
data of Schultz (25). It can be observed 
that the change in second formant 
frequency corresponds grossly to the 
change in the tongue-alveolus measure. 


Discussion 

The results of this study confirm 
the conclusion of previous investigators 
that cinefluorography is a very prom- 
ising technique for studies of speech 
articulation. Cinefluorographic pictures 
having adequate definition to permit 
measurement of the articulatory struc- 
tures during single-frame projection 


can be obtained at tolerable radiation
dosages. The frame-by-frame tracing- 
measuring procedures, although time 
consuming, yield data which appear to 
be of acceptable reliability for most re- 
search purposes. The pilot study dem- 
onstrates that useful and consistent data 
on speech physiology can be obtained 
with cinefluorographic techniques. 
Although there are many advantages 
in using cinefluorography in speech re- 
search, certain limitations of this tech- 
nique still exist. The five-inch image 
intensifier limits study to a small 
anatomical area. On the equipment 
utilized in this study, the subject’s head
cannot be placed directly against the
intensifier tube. The anatomical field
of view is thus less than the five-inch
diameter of the intensifier receiving
screen; approximately a four-inch diameter
circle is usually obtained. Even
with this small area it is usually possible 
to obtain views of all articulatory struc- 
tures from the anterior teeth to the 
posterior pharyngeal wall and from 
above the palatal plane to the hyoid 
bone. On large subjects, however, it is 
sometimes not possible to photograph 
all of the structures of interest. Modi- 
fication is now being effected on the 
equipment used in this study so that 
the subject’s head can be placed closer 
to the intensifier, thus increasing the 
field size slightly. Relief from this lim- 
itation could be obtained by using a 
larger intensifier such as the Philips 
nine-inch tube, but at additional cost. 
As a result of the relatively small 
field of view, few stable reference 
structures can be photographed. Struc- 
tures which have been commonly used 
to establish reference lines for measure- 
ments are usually not included. It is 







thus necessary to establish reference 
lines by other means, for example, by 
use of a superimposed grid. There is, of 
course, some question concerning the 
desirability of describing speech articu- 
latory positions in reference to certain 
fixed anatomical structures which may 
have no functional relationship to 
speech production. Even without ref- 
erence lines, however, useful data on 
speech physiology can be obtained. 

The primary limitation of cine- 
fluorography, at least with a camera 
speed of 24 frames per second, is that 
it does not provide an adequate sam- 
pling of some speech articulatory move- 
ments. For movements of certain 
structures this sampling rate appears 
to be adequate; however, for tongue 
movements a faster camera speed is 
needed. As mentioned above, tongue 
contacts of short duration can often be 
missed between cinefluorographic 
frames. Even if the contacts can be 
visualized, they often are of only one 
frame duration. Tongue movements 
also appear to be very rapid; the tongue 
often moves a distance of 20 milli- 
meters or more in two or three frames. 
Such sampling is surely not adequate 
for studying details of tongue move- 
ment. 
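
The sampling problem can be made concrete with a rough calculation, sketched here in Python. It treats 'two or three frames' as two or three frame intervals at 24 frames per second and takes the 20 mm figure at face value; the text does not say whether that distance is life size or projected. The function names are hypothetical.

```python
def mean_speed_mm_per_s(distance_mm, n_frames, frame_rate=24.0):
    """Average speed implied by a displacement spanning n_frames frame intervals."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return distance_mm * frame_rate / n_frames


def travel_per_frame_mm(speed_mm_per_s, frame_rate):
    """Distance covered between successive frames at a given camera speed."""
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return speed_mm_per_s / frame_rate


fast = mean_speed_mm_per_s(20.0, 2)   # 20 mm in two frame intervals
slow = mean_speed_mm_per_s(20.0, 3)   # 20 mm in three frame intervals
print(slow, fast)
print(travel_per_frame_mm(fast, 24.0), travel_per_frame_mm(fast, 72.0))
```

On these assumptions the tongue would travel roughly 7 to 10 mm between successive frames at 24 frames per second, and about a third of that at 72 frames per second.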

The use of faster camera speeds, how- 
ever, presents a number of problems. As 
the exposure time for each frame is 
decreased the intensity of radiation nec- 
essary to provide adequate film ex- 
posure for each frame must be 
increased. It is possible, however, that 
improvements in film processing will 
allow some increase in camera speed 
without a great increase in radiation 
over the relatively low levels now 
utilized. 


A major problem involved in increas- 
ing camera speed is that standard, high- 
speed photographic equipment provides 
no method of obtaining a sound track. 
If a sound track is necessary, and it 
appears to be for speech research, spe- 
cial sound recording equipment must be 
used. Equipment for obtaining pictures 
and a synchronized sound track at 
camera speeds up to 72 frames per 
second has recently been obtained for 
the cinefluorographic equipment utilized 
in this study. Whether or not this increase
in speed will provide adequate
sampling of speech movements remains 
to be determined. 

The data obtained in this study, al- 
though limited to two subjects and 
extremely small samples of speech, sug- 
gest topics for future research. 

It appears that velopharyngeal closure 
on vowels may vary systematically, not 
only as a function of the vowel sound 
produced, but also as a function of the 
phonetic context of the vowel. The 
variation in closure for consonant 
sounds also needs investigation. These 
relationships should be studied for nor- 
mal speakers and for speakers who 
exhibit speech deviations presumably re- 
lated to velopharyngeal dysfunction. 

The variation in tongue and jaw 
positions for speech sounds in a variety
of phonetic contexts should be studied 
for subjects with both normal and ab- 
normal speech. The time relationships 
between various structural movements 
for different types of speakers and for 
different speech sequences also might 
be investigated with cinefluorography. 

Another area of possible investigation 
involves the relationships between the 
physiological and acoustic character- 
istics of speech. That these types of 










relationships can be studied, at least 
grossly, is demonstrated by the results 
obtained on the relationship between 
tongue-alveolus distance and second 
formant frequency of vowel sounds in 
this study. A limitation of such investi- 
gation is that the acoustic character- 
istics of speech undoubtedly depend on 
physiological changes in three dimen- 
sions while cinefluorographic data de- 
pict only two dimensions of the system. 
A further problem for this type of 
study is that the noise of the rotating 
anode produces a significant interfer- 
ence in simultaneous recordings of the 
speech sample. In the spectrographic 
analysis made in this study it appeared 
that the noise contains little energy be- 
low 3000 cps, so that it interferes very 
little with the vowel formants. How- 
ever, high frequency consonant energy 
on spectrograms is effectively masked 
by the noise. 


The results of the present study re- 
veal few, if any, characteristics of speech 
articulation which appear to be con- 
stant with time during speech produc- 
tion. Positions of the articulatory 
structures were shown to vary almost 
constantly, even during productions of 
isolated, sustained vowel sounds. This 
observation lends support to the notion 
that a single, cross-section-in-time analy- 
sis, such as that provided by single- 
exposure x-ray procedures, has serious 
limitations as a basis for description of 
the physiological characteristics of 
speech. If this much is granted, it 
would appear that almost every phase 
of speech which has been studied by 
single-exposure techniques should be re- 
studied using cinefluorographic pro- 
cedures so that the dynamic changes in 
articulatory positions can be observed. 


Summary 


This study was designed to investi- 
gate the methodological problems in- 
volved in using cinefluorography for 
studies of the physiological character- 
istics of speech articulation. The results 
of experimentation with x-ray generator 
settings, x-ray filtration, film stock, and 
film processing were presented. It was 
concluded that pictures which provide 
adequate definition for measurements 
of structures can be obtained with 
tolerable radiation dosages. 

A frame-by-frame tracing-measuring 
procedure by which cinefluorographic 
information can be quantified was de- 
scribed. Reliability analyses indicated 
that the analyzing procedures provide 
data of adequate reliability for most 
research purposes. 

The results of a pilot study of two 
adult subjects with normal speech point 
up the limitations of single-exposure 
radiography in studies of speech physi- 
ology and suggest possible areas for 
future cinefluorographic research. Vari- 
ous limitations of cinefluorography 
were considered. It was concluded that 
this technique is a valuable research 
procedure for studies of the physiology 
of speech articulation. 


Acknowledgment 


The author is indebted to Dr. D. C. 
Spriestersbach, principal investigator of 
grants M-1158 and D-853, for his as- 
sistance and support throughout the 
course of this investigation. 









Color-Form Attitudes of Deaf Children 


DONALD G. DOEHRING 


Children who are deaf may exhibit 
enhanced, impoverished, distorted, or 
normal visual functioning. Each of 
these contingencies has been reported 
for specific visual abilities (4). The 
lack of agreement among studies has 
resulted in considerable confusion re- 
garding the effect of deafness on visual 
functioning in general, and suggests the 
need for continued study of specific 
visual processes in deaf children. In
the present experiment an attempt was 
made to determine how deaf children, 
as compared with hearing children, 
categorize certain types of visual stim- 
uli and, more specifically, whether in 
differentiating among visual stimuli 
deaf children make use of the attributes 
of color, size, and shape in the same 
manner as do hearing children. 
When an individual is asked to dif- 
ferentiate among two-dimensional stim- 
ulus objects he can, among other things, 
categorize the objects according to 
their color or according to such form 
characteristics as size and shape. In 
previous investigations of color-form 
responding, the tendency to respond 





Donald G. Doehring (Ph.D., Indiana Uni- 
versity, 1954) is Assistant Professor of Medi- 
cal Psychology, Department of Surgery, 
Indiana University School of Medicine. This 
study was conducted while he was Research 
Associate, Central Institute for the Deaf. The 
study was supported by a grant (B-1718) 
from the National Institutes of Neurological 
Diseases and Blindness of the National In- 
stitutes of Health. 




on the basis of one or the other stim- 
ulus quality has been related to level 
of affective functioning (1, 5), to per- 
sonality characteristics (3, 5), and to 
developmental level (2, 3). No previous 
investigation has been concerned with 
a direct comparison of the color-form 
attitudes of deaf children with those 
of hearing children. However, the per- 
formance of deaf children on a sorting 
test in experiments by McKay (7) and 
Larr (6) has been interpreted as in- 
dicative of a color-form responding 
tendency. 


The present experiment involved a 
specific test of the color and form at- 
titudes of large groups of 8- to 12-year- 
old deaf and hearing boys and girls. 
In order to determine whether color 
and form responses on this particular 
test changed as a function of chrono- 
logical age, the test was also ad- 
ministered to a group of very young 
hearing children and to a group of 
hearing adults. 


Method 


Subjects. Subjects tested were 95 
deaf children ranging in age from eight 
to 12 years, the group including 45 
girls and 50 boys. In the same age 
range 90 hearing children were tested, 
this group including 45 girls and 45 
boys. The two groups were well 
matched in chronological age. The 
mean age of the deaf boys was 10.41 




(SD = 1.70), the mean age of the 
deaf girls was 10.31 (SD = 1.70), the 
mean age of the hearing boys was 
10.42 (SD = 1.76), and the mean age 
of the hearing girls was 10.36 (SD = 
1.83). Deaf children were from three 
schools for the deaf: a private school, 
a parochial school, and a city public 
school. The hearing children were 
from a private school, a parochial
school, a city public school, and two 
suburban public schools. 


The group of younger hearing chil- 
dren was composed of 47 4- and 5- 
year-old children, 24 girls and 23 boys, 
who attended a university-sponsored 
nursery school. Eight girls and seven 
boys who could not perform the task 
correctly were eliminated. Therefore, 
the final group of nursery school chil- 
dren included 16 girls and 16 boys. 

Hearing adults tested were 33 stu- 
dents from an evening college class, 
this group including 20 males and 13 
females. The age range was 18 to 35. 
The mean age of the males was 23.9 
(SD = 3.6) and the mean age of the 
females was 25.4 (SD = 4.36). 


Procedure. The test used was similar 
to tests used in previous investigations 
of color-form attitude (2, 3, 5). The 
test material consisted of 28 cards that 
were presented to the subject one at a 
time. On each 4” x 6” tan card were 
pasted three geometric forms that had 
been cut from colored construction 
paper. These visual stimuli could differ 
from each other in the attributes of 
color, shape, and size, and the subject 
was required to point to the figure that 
he judged to be most different among 
the three geometric forms on a given 
card. Within each of the three stimulus 
attributes, two values were used. The 


two colors used were red and blue, the 
two shapes were circle and rectangle, 
and the two sizes were designated as 
large and small. All possible combina- 
tions of these values were employed, 
with the result that there were eight 
different test figures: a large red circle, 
a large blue circle, a large red rectangle, 
a large blue rectangle, a small red circle, 
a small blue circle, a small red rec- 
tangle, and a small blue rectangle. The 
values within each stimulus attribute 
were selected arbitrarily with the in- 
tention that the difference between 
the two values for a given attribute 
should not be extreme, but should be 
easily discriminable. A preliminary in- 
vestigation was conducted to select 
small rectangles that were subjectively 
equivalent in size to the small circles, 
and large rectangles that were subjec- 
tively equivalent in size to the large 
circles. The small circles were .75 in. 
in diameter; the large circles were .94 
in. in diameter; the small rectangles 
were .75 in. in height and .56 in. in 
width; and the large rectangles were 
1.00 in. in height and .75 in. in width. 
Red and blue were selected because 
the one could be discriminated from 
the other even by color-blind individ- 
uals. 


The first eight cards that were pre- 
sented to the subject were training 
cards. On each training card two of the 
test figures were identical in color, 
shape, and size, and the remaining 
figure differed from the other two in 
all three of the attributes. Each of the 
eight cards contained a different combination of test figures.
Since only one correct response was possible when the subject
was asked to point to the most different figure on the training
cards, the trials with these cards served to clarify the
subject’s task and thereby to eliminate the need for detailed
verbal instructions.

Figure 1. Upper: Example of a test card in which each figure is
similar to each of the other figures in only one attribute.
Lower: Example of a test card in which one of the three figures
is similar to the other two figures in two attributes, and each
of the remaining figures is similar to the other two figures in
only one attribute.

On the 20 test cards the figures were 
related in two different ways: 

(a) On eight of the cards no two 
figures were alike in more than one 
attribute, and, as a consequence, each 
of the three figures on a card differed 
from the other two figures in two 
attributes. There was no correct re- 
sponse, and the subject’s selection of 
the figure that appeared most different 
would be based on the attribute within 


which the differences in values ap- 
peared to be greatest. A different com- 
bination of test figures was used for 
each of the eight cards. One of these 
cards is shown in the upper part of 
Figure 1. On this card, a subject for 
whom the color difference appeared to 
be largest would point to the large red 
circle, a subject for whom the shape 
difference appeared to be largest would 
point to the large blue rectangle, and a 
subject for whom the size difference 
appeared to be largest would point to 
the small blue circle. 


(b) On the remaining 12 cards, one 
of the three figures was similar to the 
other two figures in two attributes, and 
each of the remaining figures was sim- 
ilar to the other two figures in only 
one attribute. Thus, a response to one 
of the stimuli would be considered in- 
correct, and the subject’s choice be- 
tween the remaining two stimuli would 
indicate the attribute within which the 
largest difference appeared to occur. 
This arrangement was possible only 
when one dimension was held constant. 
Each of the 12 cards contained a dif- 
ferent combination of test figures and 
the value for a given attribute was held 
constant on four of the cards. One of 
these cards is shown in the lower part 
of Figure 1. For this card, where size 
was held constant, a subject who 
pointed to the blue circle would be 
responding to the color difference, a 
subject who pointed to the red rec- 
tangle would be responding to the 
shape difference, and a subject who 
pointed to the red circle would have 
made an incorrect response. 
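The two card arrangements described in (a) and (b) can be checked by enumeration. The following is only an illustrative sketch, not part of the original procedure: the function names are hypothetical, and the large size given to the lower Figure 1 card is an assumption, since the text says only that size was held constant. The sketch finds exactly eight triples of type (a), which agrees with the eight cards reported, and 24 candidate triples of type (b), of which the study used 12 (four for each attribute held constant); which 12 were chosen is not stated.

```python
from itertools import combinations, product

# Each figure is a (size, color, shape) tuple; 2 x 2 x 2 = 8 test figures.
ATTRIBUTES = ("size", "color", "shape")
FIGURES = list(product(("large", "small"),
                       ("red", "blue"),
                       ("circle", "rectangle")))


def shared(a, b):
    """Number of attributes on which two figures have the same value."""
    return sum(x == y for x, y in zip(a, b))


def type_a_cards():
    """Triples in which no two figures are alike in more than one attribute."""
    return [card for card in combinations(FIGURES, 3)
            if all(shared(a, b) == 1 for a, b in combinations(card, 2))]


def type_b_cards():
    """Triples in which one figure shares two attributes with each of the others."""
    return [card for card in combinations(FIGURES, 3)
            if sorted(shared(a, b) for a, b in combinations(card, 2)) == [1, 2, 2]]


def odd_one_out(card, attribute):
    """Figure whose value on `attribute` is unique on the card, or None."""
    i = ATTRIBUTES.index(attribute)
    values = [figure[i] for figure in card]
    for figure in card:
        if values.count(figure[i]) == 1:
            return figure
    return None


# Upper card of Figure 1, as described in the text.
upper = (("large", "red", "circle"),
         ("large", "blue", "rectangle"),
         ("small", "blue", "circle"))

# Lower card of Figure 1; "large" is an assumed value for the constant size.
lower = (("large", "blue", "circle"),
         ("large", "red", "rectangle"),
         ("large", "red", "circle"))

if __name__ == "__main__":
    print(len(type_a_cards()), len(type_b_cards()))
    for attribute in ATTRIBUTES:
        print(attribute, odd_one_out(upper, attribute), odd_one_out(lower, attribute))
```

On the upper card every attribute singles out a different figure, so any choice is interpretable; on the lower card no figure is unique in size, and the red circle is unique in nothing, which is why pointing to it counts as an incorrect response.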


TABLE 1. Mean proportion of shape responses made by males and females within each group. Numbers
in parentheses are means of proportions transformed into units of 2 arcsin √p.

Group      Age        Male                  Female
                      N     Proportion      N     Proportion
Deaf       8 to 12    50    .386 (1.27)     45    .53 (1.66)
Hearing    8 to 12    45    .58 (1.78)      45    .66 (1.94)
Hearing    4 to 5     16    .56 (1.72)      16    .70 (2.04)
Hearing    Adult      20    .54 (1.65)      13    .73 (2.01)


The eight training cards were always presented first, and the
remaining 20 cards were then presented in random order. In most
cases the only instruction given to the subject was ‘Point
to the one that looks most different.’
The concept of ‘difference’ was ex- 
plained more fully to some of the 
nursery-school children, and it was 
necessary to add the instruction ‘Point 
to the one that does not look the same’ 
for some of the deaf children. During 
the test series some subjects asked 
which of the three attributes should be 
used to differentiate the figures, and 
some subjects commented that all of 
the figures looked different. The ex- 
perimenter’s response to both state- 
ments was ‘Just point to the one that 
looks most different to you.’ 

On the training cards the experi- 
menter positively reinforced the first 
three or four correct responses by nod- 
ding and saying ‘That’s right.’ When 
the subject pointed to one of the two 
identical figures on a training card, he 
was required to point again until he 
pointed to the figure that was different. 
If a subject made more than three er- 
rors on the training cards, the training 
series was repeated. The task was dis- 
continued when an error was made 
during the repetition of the training 
series, or when the subject perseverated 
with a position response during the test 
series. All of the deaf children and the 
older hearing children were able to 
perform the task correctly, but, as 


mentioned above, the task proved to be
too difficult for 15 of the nursery-school
children.


Results 


There were 20 responses recorded 
for each subject. Since each of the 
three attributes was held constant on 
four of the 20 test cards, the maximum 
number of responses that could be 
based upon a single attribute was 16. 
Only four of the 250 subjects who 
were tested made more than four size 
responses, possibly because the majority 
of subjects tended to confound size 
differences with shape differences. For 
the analysis of results, therefore, re- 
sponses based upon size were not used, 
and responses based upon shape were 
considered to be form responses. A 
single score was obtained for each sub- 
ject by dividing the number of shape 
responses by the total number of shape 
and color responses. With scores com- 
puted in this manner, subjects who re- 
sponded mostly on the basis of shape 
would receive a high score, and sub- 
jects who responded mostly on the 
basis of color would receive a low 
score. 
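The scoring rule just described can be written out as a minimal sketch. The function name and the example counts are hypothetical; only the rule itself (shape responses divided by the sum of shape and color responses, with size responses set aside) comes from the text.

```python
def shape_score(shape_responses: int, color_responses: int) -> float:
    """Proportion of shape responses among shape and color responses.

    Size responses are left out of both numerator and denominator,
    following the scoring rule described in the text.
    """
    if shape_responses < 0 or color_responses < 0:
        raise ValueError("response counts must be non-negative")
    total = shape_responses + color_responses
    if total == 0:
        raise ValueError("no shape or color responses to score")
    return shape_responses / total


if __name__ == "__main__":
    # Hypothetical subject: 12 shape, 6 color and 2 size responses on 20 cards.
    print(round(shape_score(12, 6), 3))
```

A score near 1 marks a subject who responded mostly to shape, and a score near 0 marks one who responded mostly to color.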


Table 1 shows the mean proportion of shape responses made by
males and females within each group. It can be seen that within
each group the males made a considerably smaller proportion of
shape responses and, consequently, a larger proportion of color
responses than the females. Within the three hearing groups, the
males tended to make shape responses on slightly more than half
of the trials, while the females tended to make shape responses
on two-thirds to three-fourths of the trials. Deaf children of
each sex tended to make more color responses than their hearing
counterparts. The deaf boys tended to make shape responses on
slightly more than one-third of the trials and the deaf girls
tended to make shape responses on slightly more than one-half of
the trials.


TABLE 2. Summary of analysis of variance for
evaluating differences between males and females
and between deaf and hearing children (aged 8
to 12 years) in making response to shape.

Source                 df     ms       F*
Hearing Status (H)       1    .1552    18.05
Sex (S)                  1    .0741     8.62
H x S                    1    .0130     1.51
Error                  181    .0086
Total                  184

* An F of 3.91 (df = 1, 150) is required for significance
at the 5% level, and an F of 6.81 is required for
significance at the 1% level.


TABLE 3. Summary of analysis of variance for
evaluating differences between males and females
and between young (age 4 to 5 years) and adult
hearing subjects in making response to shape.

Source        df     ms       F*
Age (A)         1    .0025
Sex (S)         1    .1156    4.59
A x S           1    .0004
Error          61    .0252
Total          64

* An F of 4.00 (df = 1, 60) is required for significance
at the 5% level.


The differences in color-form at- 
titude between males and females and 
between deaf children and hearing chil- 
dren were significant, as shown by the 
statistical analyses summarized in 
Tables 2 and 3. In order that the dis- 
tribution of scores might more closely 
approximate a normal distribution, all 
proportions were transformed to 2 arcsin √p
(in radians) before statistical
analysis (8, pp. 423-424). The per- 
formance of the deaf children was com- 
pared with that of the 8- to 12-year- 
old hearing children by the analysis 
summarized in Table 2. Differences in 
color-form attitude, both between boys 
and girls and between deaf children 
and hearing children were significant 
beyond the 1% level. Since the inter- 
action of sex and hearing status was 
not significant, there is no evidence 
that the difference in color-form at- 
titude between boys and girls depends 
upon the group (deaf or hearing) to 
which the children belong. The per- 
formance of the nursery-school chil- 
dren was compared with that of the 
adults by the analysis summarized in 
Table 3. The difference between sexes 
was significant beyond the 5% level, 
and, as would be expected from the re- 
sults shown in Table 1, neither the age 
effect nor the interaction of age and 
sex approached significance. 
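The transformation and the F ratios reported above can be reproduced with a short sketch. The function names are hypothetical, and the F ratio is assumed to be the mean square for the effect divided by the error mean square, the usual form for an analysis of this kind. Note also that the values in parentheses in Table 1 are means of transformed individual scores, so they need not equal the transform of the mean proportion.

```python
import math


def arcsine_transform(p: float) -> float:
    """Return 2 * arcsin(sqrt(p)) in radians for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F ratio, assumed here to be effect mean square over error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


if __name__ == "__main__":
    # The transform maps proportions 0..1 onto 0..pi radians.
    for p in (0.0, 0.5, 1.0):
        print(p, round(arcsine_transform(p), 4))

    # Mean squares as reported in the analysis-of-variance tables.
    print(round(f_ratio(0.1552, 0.0086), 2))  # hearing status, Table 2
    print(round(f_ratio(0.0741, 0.0086), 2))  # sex, Table 2
    print(round(f_ratio(0.1156, 0.0252), 2))  # sex, Table 3
```

The three ratios round to 18.05, 8.62, and 4.59, in agreement with the F values given in the tables.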


Discussion 


The results of this experiment show 
that more deaf children than hearing 
children based their responses upon 
color differences in a situation where 
responses can be based upon differ- 
ences in either the color or the shape 
of two-dimensional visual stimuli. Con- 
sequently, it may be said that deaf 




children do not perceive visual stimuli 
in the same manner as do hearing chil- 
dren. The magnitude of this difference 
in color-form attitude can be assessed 
by a comparison with the magnitude 
of the observed difference between the 
color-form attitudes of males and fe- 
males. As shown by the results pre- 
sented in Table 1, the difference 
between the proportions of shape re- 
sponses made by deaf boys and hearing 
boys and between those for deaf girls 
and hearing girls was of the same order 
of magnitude as the differences be- 
tween the boys and the girls within 
each group. Consequently, among the 8- 
to 12-year-olds who were tested, the 
hearing girls showed the greatest tend- 
ency to respond to shape differences, 
the deaf girls tended to display approxi- 
mately the same color-form attitude as 
the hearing boys, and the deaf boys 
showed the greatest tendency to re- 
spond to color differences. 


Although color-form attitude has 
been found to change systematically as 
a function of age in a number of pre- 
vious investigations (2, 3), in the pres- 
ent study there was no evidence of 
change in the distribution of color- 
form attitudes between the group of 
hearing nursery-school children and the 
group of hearing adults. Therefore, the 
bias toward color responding displayed 
by the deaf children should probably 
not be interpreted as indicative of a re- 
tardation in the development of visual 
perceptual processes. 


It must be concluded that deaf chil- 
dren as a group differ from hearing 
children as a group in the visual per- 
ceptual processes that are specifically 
involved in color-form attitudes. 
Whether this difference has an environ- 


mental or an organismic basis cannot 
be specified at present. Some insight 
into the factors involved in the color- 
responding tendency of deaf children 
might be obtained by a determination 
of the basis for the difference in color- 
form attitude between males and fe- 
males. 


Summary 


A test of color-form attitude was 
administered to 95 deaf children and 
90 hearing children ranging in age 
from eight to 12 years, and also to 32 
hearing nursery-school children and 33 
hearing adults. The results indicated 
that the deaf children, as compared 
with their hearing peers, showed a 
greater tendency to differentiate visual 
stimuli on the basis of differences in 
color. Within the deaf group and with- 
in each of the hearing groups, males 
tended to make more color responses 
than did females. There was no change 
in the distribution of color-form at- 
titudes between the nursery-school 
group and the adult group. The tend- 
ency toward color responding in deaf 
children was discussed with reference 
to the sex difference and the lack of 
change in color-form attitudes as a 
function of age. 


Acknowledgement 


The writer wishes to thank Dr. Jo- 
seph Rosenstein for his assistance in 
this experiment. 


References 


1. Baughman, E. E., The role of the stimulus in Rorschach
responses. Psychol. Bull., 55, 1958, 121-147.

2. Brian, Clara R., and Goodenough, Florence L., The relative
potency of color and form perception at various ages. J. exp.
Psychol., 12, 1929, 197-213.

3. Colby, Martha G., and Robertson, Jean B., Genetic studies in
abstraction. J. comp. Psychol., 33, 1942, 385-401.

4. Getz, S. B., Environment and the Deaf Child. Berkeley:
California School for the Deaf, 1953.

5. Honkavaara, Sylvia, A critical reevaluation of the color and
form reaction, and disproving of the hypotheses connected with
it. J. Psychol., 45, 1958, 25-36.

6. Larr, A. E., Perceptual and conceptual abilities of
residential school deaf children. J. except. Child., 23, 1956,
63-66; 88.

7. McKay, B. Elizabeth, An exploratory study of psychological
effects of severe hearing impairment. Ph.D. dissertation,
Syracuse Univ., 1952.

8. Walker, Helen M., and Lev, J., Statistical Inference. New
York: Holt, 1953.


RESEARCH NEWS NOTES


Speech Discrimination Testing
with Hearing Young Children


An integral part of the complete audiological battery in the
assessment of hearing loss includes evaluation of the subject’s
responses to speech as a stimulus, using tests standardized,
largely, on adult populations with some normative data on mature
and articulate children. If the child is immature or has an
articulation defect, it is impossible to evaluate an erroneous
response to a PB stimulus word in terms of whether he failed to
hear the stimulus correctly, or whether he simply cannot repeat
it correctly. If the expected responses were nonverbal in
character, an erroneous response could be construed as a result
of a defect of hearing, specifically a discrimination loss
pathognomonic of cochlear involvement.

A group of normal hearing preschool and young school-age
children is to be tested. With the subject seated in a
sound-treated chamber, a series of words (preceded by a carrier
phrase ‘Show me the ... .’) will be presented through a
calibrated speech audiometer and loudspeaker system. The subject
will be required to respond to each command by designating
manually one of several pictures on cards within his reach. The
key words were taken, for the most part, from the Thorndike
lists of words familiar to children, were analyzed and grouped
according to phonetic makeup, similar to the PB-50 word lists.
Normative data are to be derived statistically from results of a
number of such tests. Although the technique of pointing to
picture cards has been reported by Keaster and others, this is
believed to be the first attempt to use it for discrimination
testing beyond speech reception threshold testing.


Bernard A. Landes, Ph.D.
Director, Speech-Hearing Clinics
Texas Technological College, Lubbock


Speech Compression Devices 


The Department of Speech of the University of Arizona is
currently investigating, in cooperation with the Applied
Research Laboratory, various types of communication systems.
These systems are primarily concerned with limited band widths
and speech compression devices. Additional research is to be
concerned with intelligibility in the presence of various types
of interference. Research completed by 1960 candidates for the
M.A. degree includes studies on motivational devices for testing
hearing in children, intelligibility and listener reaction of
electrolarynges, articulation testing procedures, the evaluation
of therapy techniques with bilingual children, and evaluation of
bone conduction audiometric testing procedures.


Klonda Lynn, Ph.D. 
Head, Department of Speech 
University of Arizona, Tucson 










Factors Affecting Thresholds for Short Tones 


ROBERT GOLDSTEIN 


JOAN C. KRAMER 


As the duration of a sound is decreased 
below a certain critical value, the in- 
tensity of the sound must be corre- 
spondingly increased for the sound to 
reach threshold, or to be as loud as or 
as detectable as a given longer sound. 
The majority of previous reports indi- 
cates the critical duration to be ap- 
proximately 150 msec (5, 6, 9, 10, 11, 
12, 13, 15). This ‘trading’ relation be- 
tween duration and intensity is gen- 
erally claimed to be linear. A formula 
describing the relation is: It = C for 
t < 150 msec, where I (intensity) and 
t (some unit of time) are both ex- 
pressed in either linear or logarithmic 
units, and C is some constant. The 
simplicity and exactness of the above 
formula have been challenged by some 
investigators (7, 11, 12, 13, 14), and by 
the results of the present investigation. 
This study, however, was not motivated by doubts about the
accuracy of the above generalities but was planned as the basis
for an electrophysiologic test which would utilize deviations
from the normal intensity-duration function for diagnostic
purposes.


Robert Goldstein (Ph.D., Washington University, 1952) is
Director, Audiology Section, Department of Otolaryngology,
Jewish Hospital of St. Louis. Joan C. Kramer (M.S., University
of Pittsburgh, 1956) is Clinical Audiologist, Central Institute
for the Deaf, St. Louis. This study was supported by Grant
B-1796 from the National Institutes of Health to the Jewish
Hospital of St. Louis. An adaptation of the paper was presented
at the 1959 Convention of the American Speech and Hearing
Association, Cleveland.
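The trading relation It = C quoted in the opening paragraph can be illustrated numerically. This is only a sketch under stated assumptions: I is taken as intensity in linear (power) units, the critical duration is taken as the approximate 150 msec cited in the text, and the function name is hypothetical. Under these assumptions the intensity needed at threshold rises by 10 log10(150/t) db as the duration t falls below 150 msec, or about 3 db for each halving of duration. The paper itself notes that the simplicity and exactness of this relation have been challenged.

```python
import math

CRITICAL_DURATION_MS = 150.0  # approximate critical duration cited in the text


def threshold_shift_db(duration_ms: float,
                       critical_ms: float = CRITICAL_DURATION_MS) -> float:
    """Rise in threshold intensity (db) predicted by I * t = C.

    Assumes I is in linear power units, so that
    I(t) / I(critical) = critical / t for t below the critical duration.
    At or beyond the critical duration no shift is predicted.
    """
    if duration_ms <= 0:
        raise ValueError("duration must be positive")
    if duration_ms >= critical_ms:
        return 0.0
    return 10.0 * math.log10(critical_ms / duration_ms)


if __name__ == "__main__":
    for t in (20, 50, 100, 200, 400, 2000):
        print(t, "msec:", round(threshold_shift_db(t), 2), "db")
```

Departures of measured thresholds from such a predicted curve are the kind of deviation from the normal intensity-duration function that the diagnostic applications described here would look for.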

Harris, Haines, and Myers (9), and Miskolczy-Fodor (12) have
already reported on the diagnostic use of alterations in the
normal intensity-duration function in adults with hearing
impairment. Jerger (10) found that normal listeners whose ears
were temporarily fatigued by intense noise showed an
intensity-duration relation at threshold similar to that
described by Miskolczy-Fodor in patients with loudness
recruitment.

Derbyshire* suggested that the intensity-duration function at
threshold might be used to differentiate peripheral from central
auditory impairment in young children. He also suggested that
the intensity-duration function could be determined by
electroencephalic audiometry (3, 17) in children too young to
give reliable behavioral responses.

The present study was planned as 
the first in a series of studies to test 


*A. J. Derbyshire, personal communication. 




Derbyshire’s hypothesis and to develop 
a practical technic for its clinical ap- 
plication. A test on young children 
would involve technics and apparatus 
considerably different from those used 
in previous studies on adults. The initial 
study in this series, therefore, was a re- 
determination of the psychophysical 
thresholds as a function of duration in 
normally hearing adults, with the even- 
tual goal determining the following 
criteria for the experimental pro- 
cedures: 


a. Subjects should be naive listeners, and 
must be given no trial period or train- 
ing session. (Thus, they would be 
equivalent to the young children who 
would be tested clinically. In the elec- 
troencephalic tests none of the children 
would be sophisticated listeners. They 
would be tested in a single session, and 
even if retested they would not have 
gained any practice-effect from listen- 
ing to tones while asleep and making 
no conscious responses.) 

b. A method of constant stimuli should 
be employed. (In electroencephalic au- 
diometry a method of constant stimuli 
appears to be the most satisfactory way 
of permitting objective analysis of the 
recordings (3, 17).) 

c. Intensity should be changed by 5-db 
steps. (Routine clinical audiometry em- 
ploys 5-db steps, and electroencephalic 
audiometry, the eventual procedure, 
does not justify smaller intervals with 
technics currently employed (3, 17).) 

d. The same equipment should be used 
that will later be used clinically on 
children. (In this way there will be no 
danger of introducing errors due to 
differences in calibration, in the fit of 
the earphones, and in other matters 
associated with use of equipment.) 


e. Monaural testing should be used be- 
cause the clinical goal is to test the ears 
separately. (In addition, since some 
adults with unilateral central nervous 
system impairment are to be studied, it 
is important to know the effect of uni- 
lateral lesions on the thresholds from 
the contralateral ear, and whether left- 
side lesions can be differentiated from 
right-side lesions.) 


Procedure 


Subjects. The subjects for this study 
were 24 men and 24 women between 
the ages of 18 and 76 years (mean age 
38.6). They were employees and pro- 
fessional staff members of the Jewish 
Hospital and nonprofessional acquaint- 
ances. None was an experienced or 
sophisticated listener as far as psycho- 
acoustic experimentation was con- 
cerned. 

Prospective subjects were screened 
for possible hearing losses by pure-tone 
air-conduction audiometry. The crite- 
rion for rejection was a hearing level 
greater than 10 db for 500, 1000, or 
2000 cps. Actually, no subject selected 
had a hearing level in excess of 0 db at 
1000 cps in the ear tested. Persons with 
known chronic ear disease or central 
nervous system disorder were excluded. 


Apparatus and Stimuli. The acoustic 
signals were 1000-cps tones with rise 
and fall times of 7.5 msec. The durations
of the tones from onset to termination
were 20, 50, 100, 200, 400, and
2000 msec. The tones were recorded 
electrically on magnetic tape on an 
Ampex 601 tape recorder. The tape 
then became the signal source for the 
study. 

The same tape recorder used for re- 
cording the signals was used for the 
playback system. The output from the 
recorder (600-ohm output impedance)
was fed through a Hewlett-Packard 
350 B 600-ohm variable attenuator. The 
acoustic transducer was a Dynalab in- 
sert-type earphone with 150-ohm im- 
pedance at 1000 cps. A matching 
attenuator with an insertion loss of 18 
db was placed between the variable 
attenuator and the earphone. This at- 
tenuator had three settings with the 










Goldstein, Kramer: Thresholds for Short Tones 251 


nominal values of 0, 10, and 30 db. 
Throughout all the tests the matching 
attenuator was fixed at the 30-db set- 
ting (actually 28.9 db of attenuation). 
With no attenuation on the variable 
attenuator the acoustic output gen- 
erated in a 2-cc coupler was 83.2 db 
re 0.0002 microbar. 

The acoustic pulses generated in the 
2-cc coupler were photographed from 
a cathode ray oscilloscope. No tran- 
sients or distortions were visible at any 
duration of the stimulus in any of the 
pulses photographed. No transients
were audible (18) even at levels at 
least 80 db above threshold. 

The stimuli were recorded in order 
from longest (2000 msec) to shortest 
(20 msec). That is, all of the stimuli at 
2000 msec were recorded first, then all 
of the stimuli at 400 msec, and so on. 
At least 25 tones at each of the dura- 
tions were recorded, with intervals be- 
tween successive tones varying between 
8 and 15 sec. This interval permitted 
the tester to record the response, or 
lack of one, and to readjust the variable 
attenuator according to a predetermined
schedule. Intervals were staggered
so that a subject would not be
conditioned to respond after a fixed 
period of time. 

The stimuli at 2000, 400, and 200 
msec were recorded on one reel, and 
the stimuli at 100, 50, and 20 msec on 
a second reel. Each reel began with a 
continuous 1000-cps tone for calibra- 
tion purposes. 


Experimental Procedure. During the 
experimental sessions a stock earmold 
of appropriate size was attached to the 
earphone and inserted into the ear 
canal. The tested ear and receiver were 
covered by a large, soft cushion at- 


tached to a headband. The other ear 
was covered by an inactive Telephonics 
earphone and cushion attached to the 
same headband. The right ear was used 
in one-half of the subjects and the left 
ear in the other half. 


The subject was instructed simply 
to raise his finger whenever he heard 
a sound, just as he had done in the 
initial screening audiometry. He was 
also told that each of the sounds in the 
first series would be relatively long but 
that in succeeding series the sounds 
would be progressively shorter. He was 
told to respond to the brief pip-like 
sounds, even though they might not 
have tonal quality. 

At each tonal duration 25 stimuli 
were presented, five at each of five 
intensity levels. One level was set at the 
expected threshold based on prelim- 
inary trials on five subjects. A second 
level was 5 db greater, and a third 10 
db greater. The fourth and fifth levels 
were 5 db and 10 db, respectively, less 
than the assumed threshold. Thus, there 
was a range in intensity of 20 db. The 
25 stimuli were randomized with re- 
spect to level. Two different random 
schedules were used, each with one- 
half of the subjects. 
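
The construction of such a schedule can be sketched as follows. This is only an illustration of the design described above (five stimuli at each of five levels, in random order); the expected threshold of 15 db, the function name, and the use of a seed are assumptions and are not taken from the study.

```python
import random
from collections import Counter

def make_schedule(expected_threshold_db, seed=None):
    """Return 25 presentation levels in db: five stimuli at each of five levels."""
    offsets = [-10, -5, 0, 5, 10]  # db relative to the expected threshold
    levels = [expected_threshold_db + off for off in offsets for _ in range(5)]
    rng = random.Random(seed)      # the seed is a hypothetical convenience
    rng.shuffle(levels)            # randomize the order with respect to level
    return levels

# Hypothetical expected threshold of 15 db
schedule = make_schedule(expected_threshold_db=15, seed=1)
print(schedule)
print(Counter(schedule))               # each level occurs five times
print(max(schedule) - min(schedule))   # range in intensity of 20 db
```

Two such schedules, drawn with different seeds, would correspond to the two random schedules each given to one-half of the subjects.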

Sometimes a subject responded at 
the weakest level used. In these in- 
stances, the entire schedule of 25 stim- 
uli was shifted downward (weaker 
sounds) by 5 or 10 db, depending upon 
the certainty of responses after the 
second of the weakest tones. Naturally, 
more than 25 stimuli would have to be 
given at that particular duration but 
only responses to the last 25 stimuli at 
the adjusted levels were considered in 
determining threshold. When a subject 
failed to respond at the strongest level, 







Table 1. Results of analysis of variance across
all subjects.

Source of Variation      df        ms          F

Age (A)                   1     718.84       5.85*
Sex (S)                   1      19.53
Ear (E)                   1     282.03       2.30
S x A                     1      38.28
E x A                     1       0.09
S x E                     1       0.09
S x E x A                 1     208.42       1.70
Pooled Error (PE)        40     122.83

Duration (D)              5    1625.16     107.18†
D x A                     5       4.46
D x S                     5      72.66       4.79†
D x E                     5       8.91
D x A x S                 5      23.07       1.52
D x A x E                 5       2.80
D x S x E                 5       6.13
D x S x E x A             5       2.80
D x PE                  200      15.16

* Significant at the 5% level.
† Significant at or beyond the 1% level.


the entire series was shifted upward 
(stronger sounds) by 5 or 10 db, again 
depending upon the certainty of no 
response at the second of the strongest 
tones. 

After the completion of the 200-msec 
series, the headband was removed from 
the subject, and the subject was allowed 
to relax while the tester changed reels 
on the tape recorder. The earmold 
was not removed from the ear. 








[Plot: threshold against duration in milliseconds (20, 50, 100, 200, 400, 2000).]

Figure 1. Mean threshold as a function of
duration of stimulus for all 48 subjects.
Threshold for each subject was taken as the
lowest sound pressure level for which there
were at least three responses to the five
stimuli at that level.


Analysis of Data. Threshold for each 
subject for each duration tested was 
arbitrarily chosen as the weakest in- 
tensity which elicited responses to at 
least three of the five stimuli at that 
level. Threshold at each duration was 
tabulated for each of the subjects. 
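
The threshold criterion just described can be written out as a short sketch. The response tallies in the example are hypothetical and serve only to illustrate the rule; the function name and data layout are likewise assumptions.

```python
def threshold(responses, criterion=3):
    """Return the weakest level (db) with at least `criterion` responses.

    responses maps each intensity level in db to the number of responses
    (out of five stimuli) obtained at that level. Returns None when no
    level meets the criterion.
    """
    qualifying = [level for level, n in responses.items() if n >= criterion]
    return min(qualifying) if qualifying else None

# Hypothetical tallies for one subject at one duration
example = {5: 0, 10: 1, 15: 3, 20: 5, 25: 5}
print(threshold(example))  # 15
```

A result of None corresponds to the case in which the subject failed to respond even at the strongest level, so that the series had to be shifted upward.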

The 48 subjects were divided into
eight subgroups, according to sex, age
(under 40, 40 and over), and ear tested.
Each subgroup, therefore, contained
six subjects.

Three of the subjects in each subgroup
had one of the schedules of
stimuli and three had the other. It was 
not necessary to further subdivide the 
subjects on the basis of schedule be- 
cause (a) there was no a priori reason 
to suspect different responses with each 
schedule since both schedules had been 
composed of the same set of stimuli; 
and because (b) results with each of 
the schedules showed almost identical 
figures, the mean thresholds at each 
duration not significantly different by 
a simple t test.

The data were evaluated by an anal- 
ysis of variance. 


Results 


The analysis of variance (see Table 
1) shows two significant main effects 
and one significant interaction: among 
durations, between the two age groups, 
and between sexes with respect to dura- 
tion. 
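
As an arithmetic check, the F ratios of Table 1 can be recomputed from the mean squares. The sketch below assumes, as the tabled values suggest, that the between-subject effects were tested against the pooled error term and the effects involving duration against the D x PE term; the variable names are illustrative only.

```python
# Mean squares copied from Table 1
POOLED_ERROR_MS = 122.83   # error term for between-subject effects (assumed)
D_BY_PE_MS = 15.16         # error term for effects involving duration (assumed)

between = {"Age": (718.84, 5.85), "Ear": (282.03, 2.30), "S x E x A": (208.42, 1.70)}
within = {"Duration": (1625.16, 107.18), "D x S": (72.66, 4.79), "D x A x S": (23.07, 1.52)}

def f_ratio(ms_effect, ms_error):
    """F is the ratio of the effect mean square to the error mean square."""
    return ms_effect / ms_error

for name, (ms, reported) in between.items():
    print(f"{name}: computed {f_ratio(ms, POOLED_ERROR_MS):.2f}, reported {reported}")
for name, (ms, reported) in within.items():
    print(f"{name}: computed {f_ratio(ms, D_BY_PE_MS):.2f}, reported {reported}")
```

The recomputed ratios agree with the tabled values to within rounding of the mean squares.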


Duration. The mean thresholds for 
all 48 subjects are shown in Figure 1. 
(The curves will be referred to as in- 
tensity-duration functions even though 
the ordinate is expressed in sound pres- 
sure level). The analysis of variance 
shows that threshold varies significantly 
with duration of stimulus for the stim- 







[Plot: threshold against duration in milliseconds (20, 50, 100, 200, 400, 2000), with separate curves for older (≥40) and younger (<40) subjects.]

Figure 2. Mean threshold as a function of
duration of stimulus for all 24 subjects in
each age group. Mean age of younger subjects
is 27.5 years and of older subjects 49.7
years.


ulus durations used in this study. 
Furthermore, there is a difference in 
threshold level between adjacent dura- 
tions, even for the longer durations. 
These differences were analyzed ac- 
cording to the technic of Duncan (4). 
Between 400 and 200 msec the differ- 
ence is significant at the 5% level. 
Between all other adjacent pairs the 
differences are significant at the 1% 
level. 


Age. For purposes of analysis the 
entire population was divided into two 
age groups, with an arbitrary dividing 
point of 40 years. The mean age in 
years of the 24 younger subjects was 
27.5 and of the 24 older subjects 49.7. 
Within these two age groups the 12 
younger men had a mean age of 30.1 
and the 12 younger women averaged 
24.9. The 12 older men had a mean age 
of 50.5, and the 12 older women 48.8. 
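
The subgroup means quoted above can be combined to check the group means reported elsewhere in the paper (27.5 and 49.7 years for the age groups, 40.3 and 36.9 years for men and women, and 38.6 years overall). The following is a sketch of that check; the data layout and function name are assumptions.

```python
# (number of subjects, mean age) for each subgroup, as reported in the text
subgroups = {
    ("younger", "men"): (12, 30.1),
    ("younger", "women"): (12, 24.9),
    ("older", "men"): (12, 50.5),
    ("older", "women"): (12, 48.8),
}

def pooled_mean(select):
    """Weighted mean age over the subgroups for which select(age, sex) is true."""
    chosen = [(n, m) for (age, sex), (n, m) in subgroups.items() if select(age, sex)]
    total_n = sum(n for n, _ in chosen)
    return sum(n * m for n, m in chosen) / total_n

younger = pooled_mean(lambda age, sex: age == "younger")
older = pooled_mean(lambda age, sex: age == "older")
men = pooled_mean(lambda age, sex: sex == "men")
women = pooled_mean(lambda age, sex: sex == "women")
everyone = pooled_mean(lambda age, sex: True)

for label, value in [("younger", younger), ("older", older),
                     ("men", men), ("women", women), ("all", everyone)]:
    print(f"{label}: {value:.2f}")
```

The computed values agree with the reported means to within rounding of the subgroup figures.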

There is a significant difference (0.05 
level) between thresholds of the two 
age groups. Figure 2 shows that the 
curve for the older group (≥ 40 years)
is displaced about 2.5 to 3.0 db above 
the curve for the younger group at all 
durations. There is no age-by-duration 


interaction; therefore, the displacement 
of the curve can be interpreted as 
showing the expected reduced sensi- 
tivity as a function of age. 

Five of the older subjects displayed 
somewhat aberrant reactions which 
were difficult to quantify. At the short- 
est duration (20 msec) each of these 
subjects required a loud sample, some 
30 or 40 db above their threshold, be- 
fore they were aware of the kind of 
sound to which they were to respond. 
Once oriented, four of the five then 
gave responses close to the over-all 
group threshold for 20 msec. The fifth 
required the very much stronger signal 
throughout the particular series. One 
older subject needed a strong orienting 
signal at 50 msec as well. 


Sex. Analysis fails to show a sig- 
nificant difference between men and 
women as far as over-all responses are 
concerned. There is, however, a sig- 
nificant (0.01 level) duration-by-sex 
interaction, which can be seen graphi- 
cally in Figure 3. When comparison of 
thresholds is made at each duration (4) 
the difference between men and women 


[Plot: threshold (SPL re 0.0002 microbar) against duration in milliseconds (20, 50, 100, 200, 400, 2000), with separate curves for men and women.]

Figure 3. Mean threshold as a function of
duration of stimulus for all 24 subjects of
each sex. Mean age of men is 40.3 years; of
women, 36.9 years.







is significant only at 2000 msec (in 
favor of women) and at 20 msec (in 
favor of men). Figure 3 shows quite 
clearly that there is a gradual transpo- 
sition of the curves with the intersec- 
tion at about 60 msec. 


Ear. No significant differences in 
thresholds emerge between right and 
left ears. This was obvious from in- 
spection of the data prior to the sta- 
tistical evaluation shown in Table 1. 


Discussion 


The intensity-duration curve found 
in the present study is not parallel or 
even asymptotic to the duration axis 
at the longest duration (2000 msec) 
investigated. One can predict with fair 
certainty, however, that threshold 
values would not have decreased sig- 
nificantly with further increase in dura- 
tion, even with naive subjects and the 
procedure used in this study, because 
of known limitations on the ultimate 
sensitivity of the ear. 

It is not likely that the difference in 
thresholds between 2000 and 400 msec, 
and between 400 and 200 msec is an 
artifact of the procedure. If anything, 
a practice effect would have acted to 
flatten the curve, that is, would have 
reduced these differences, since the 
durations were presented in order from 
longest to shortest. Plomp and Bouman 
(14) claim that order has no effect on 
the resultant values. 

While the findings in the present 
study are in contradiction to the com- 
mon assumption that threshold sensi- 
tivity does not increase beyond 
approximately 150 msec, they do not 
in fact differ very much from the 
actual results of previous investigations. 
Some of the studies on which this as- 


sumption is based do not report find- 
ings for tones much longer than 200 
msec. Where reactions for longer tones 
were explored (7, 14), greater sensi- 
tivity was shown for the longer tones 
than for 150-200 msec tones. 

Between 200 and 20 msec the curve 
in Figure 1 shows that a 10-fold de- 
crease in duration requires approxi- 
mately a 10-fold increase in the strength 
of sound, 10-db intensity change for a 
10-db time change. Actually, in this 
range the shift is slightly less so that for 
the over-all population the values more 
closely approximate those given by 
Miskolczy-Fodor (11, 12). Therefore,
in the range from 200 to 20 msec, the
threshold for duration $T_2$ can be predicted
from the threshold for the longer
duration $T_1$ by the formula
$N_2\,\mathrm{db} = N_1\,\mathrm{db} + 9 \log_{10}(T_1/T_2)$.

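
A brief numerical sketch of this prediction follows. The threshold of 10 db assumed for the 200-msec tone is hypothetical and is chosen only to make the arithmetic easy to follow; the function name is likewise an assumption.

```python
import math

def predict_threshold(n1_db, t1_msec, t2_msec):
    """Predict threshold at the shorter duration t2 from threshold n1 at t1.

    Implements N2 = N1 + 9 log10(T1 / T2), stated in the text for the
    range from 200 to 20 msec.
    """
    return n1_db + 9 * math.log10(t1_msec / t2_msec)

# Hypothetical threshold of 10 db at 200 msec; a 10-fold shortening adds 9 db
print(predict_threshold(10.0, 200, 20))  # 19.0
```

A strict 10-db change for each 10-fold change in duration would correspond to a coefficient of 10 in place of 9.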
The total curve most nearly falls in
line with the formulae given by Green,
Birdsall, and Tanner (7): $It^{3/2} = C_1$ for
$t < T_1$ (values for $T_1$ between 9.9 and
19.6 msec at 1000 cps); $It = C_2$ for
$T_1 < t < T_2$ (for $T_2$ 107.7-276.3
msec); $It^{1/2} = C_3$ for $T_2 < t$. In the
present study $It = C_2$ seems to apply
between about 30 and 180 msec.
$It^{3/2} = C_1$ seems to apply below 30 msec
and $It^{1/2} = C_3$ above 180 msec. If only
the duration of maximum or peak ener- 
gy is considered (8) then the nominal 
20-msec tone is only 5 msec, and the 
nominal 50-msec tone is only 35 msec. 
This adjustment makes the values in the 
present study more nearly like those of 
Green, Birdsall, and Tanner. 
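
The adjusted durations quoted above can be reproduced on the assumption that the duration of peak energy is the nominal duration less the rise time and the fall time (7.5 msec each). This reading reproduces the 5-msec and 35-msec figures given in the text, but it is an interpretation rather than a statement of the method of reference (8).

```python
RISE_MSEC = 7.5
FALL_MSEC = 7.5

def peak_duration(nominal_msec):
    """Duration at full amplitude, assuming it excludes the rise and fall times."""
    return nominal_msec - RISE_MSEC - FALL_MSEC

for nominal in (20, 50, 100, 200, 400, 2000):
    print(nominal, peak_duration(nominal))
```

On this assumption the correction is proportionally large only for the two shortest tones, which is where the adjustment matters for the comparison.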

Sex differences reported here have 
not been described previously. With 
few exceptions the sex of normal sub- 
jects in other studies has not been given 







in the reports, and no breakdown of 
results has been given as a function of 
sex. 

The differences in the intensity- 
duration curves, as a function of sex, 
limit the value of previous reports
(9, 12) on the clinical applicability of 
deviations from the normal curves. 
What may be abnormal values for a 
woman may be within the range of 
normal values for men. No reason is 
apparent for the differences between 
men and women. 

In that a sex difference has now 
been shown for a voluntary behavioral 
response as well as for involuntary 
electrophysiologic responses (1, 2, 16), 
it is suggested that some generalities 
about perception and conditioning
must be re-evaluated with more con- 
sideration given to individual differ- 
ences among the normal subjects. 
Normal subjects either randomly or 
selectively chosen cannot necessarily 
be considered a homogeneous group. 


Summary 


Psychophysical thresholds for tones 
of various durations were determined 
on 48 normal adults. The stimuli were 
bursts of 1000-cps tones with 7.5-msec 
rise and fall time. Durations measured 
from onset to termination of the tones were:
2000, 400, 200, 100, 50, and 20 msec. 

Overall there was the expected ap- 
proximately linear ‘trading’ relation 
between time and intensity as duration 
increased from 20 to 200 msec. Thresh- 
olds continued to get lower, although 
at a slower rate, for durations longer 
than 200 msec. 

The intensity-duration curve was 
steeper for women than for men. 
Women had lower thresholds at 2000 


msec, and men lower thresholds at 20 
msec. There were no significant dif- 
ferences for the intermediate durations. 

The subjects 40 years old and older 
had 2.5 to 3.0 db higher thresholds 
than the subjects less than 40 years old 
at all durations. 


Acknowledgment 


The authors gratefully acknowledge 
the assistance of Dr. Jerome R. Cox, 
Mr. Jules A. Detchemendy, and Dr. Ira 
J. Hirsh, Central Institute for the Deaf, 
in preparing the apparatus and stimuli 
used in this study; and of Dr. Robert 
C. Bilger, Speech Department, Uni- 
versity of Michigan, in the statistical 
analysis of the data. 


References 


1. Berry, J. L., and Martin, B., GSR reactivity
as a function of anxiety, instructions,
and sex. J. abnorm. soc. Psychol.,
54, 1957, 9-12.

2. Chaiklin, J. B., The conditioned GSR
auditory speech threshold. J. Speech
Hearing Res., 2, 1959, 229-236.

3. Derbyshire, A. J., and Farley, J. C.,
Sampling auditory responses at the cortical
level. Ann. Otol., etc., St. Louis, 68,
1959, 675-697.

4. Duncan, D. B., Multiple range and
multiple F tests. Biometrics, 11, 1955, 1-42.

5. Garner, W. R., The effect of frequency
spectrum on temporal integration of
energy in the ear. J. acoust. Soc. Amer.,
19, 1947, 808-815.

6. Garner, W. R., and Miller, G. A., The
masked threshold of pure tones as a
function of duration. J. exp. Psychol.,
37, 1947, 293-303.

7. Green, D. M., Birdsall, T. G., and Tanner,
W. P., Jr., Signal detection as a
function of signal intensity and duration.
J. acoust. Soc. Amer., 29, 1957, 523-531.

8. Harris, J. D., Peak vs. total energy in
thresholds for very short tones. Acta
Otolaryng., 47, 1957, 134-140.

9. Harris, J. D., Haines, H. L., and Myers,
C. K., Brief-tone audiometry. Arch. Otolaryng.,
Chicago, 67, 1958, 699-713.

10. Jerger, J. F., The influence of stimulus
duration on the pure-tone threshold during
recovery from auditory fatigue.
School Av. Med., USAF, Report No.
55-19, 1955. (PB 119410)

11. Miskolczy-Fodor, F., Monaural loudness-balance-test
and determination of recruitment-degree
with short sound-impulses.
Acta Otolaryng., 43, 1953, 573-595.

12. Miskolczy-Fodor, F., The relation between
hearing loss and recruitment and
its practical employment in the determination
of receptive hearing losses. Acta
Otolaryng., 46, 1956, 409-415.

13. Miskolczy-Fodor, F., Relation between
loudness and duration of tonal pulses.
I. Response of normal ears to pure tones
longer than click-pitch threshold. J.
acoust. Soc. Amer., 31, 1959, 1128-1134.

14. Plomp, R., and Bouman, M. A., Relation
between hearing threshold and duration
for tone pulses. J. acoust. Soc. Amer., 31,
1959, 749-758.

15. Pollack, I., Loudness of periodically
interrupted white noise. J. acoust. Soc.
Amer., 30, 1958, 181-185.

16. Rosenblut, B., Bilger, R. C., and Goldstein,
R., Electrophysiologic responses to
sound as a function of intensity, EEG
pattern and sex. J. Speech Hearing Res.,
2, 1959, 28-39.

17. Withrow, F. B., Jr., and Goldstein, R.,
An electrophysiologic procedure for determination
of auditory threshold in children.
Laryngoscope, 68, 1958, 1674-1699.

18. Wright, H. N., Switching transients and
threshold determination. J. Speech Hearing
Res., 1, 1958, 52-60.








Agrammatism and 


Inflectional Morphology in English 


HAROLD GOODGLASS 


JEAN BERKO 


Many aphasic patients, while beginning 
to recover a considerable speaking vocabulary,
continue to omit articles, relational
words, and inflectional endings
from their speech. These patients, speaking
in disconnected words rather than
sentences, are said to exhibit ‘agrammatism’
or ‘telegraphic speech.’ The
phenomenon of agrammatism has been 
described by many clinicians in the 
classical literature on aphasia; it con- 
tinues to be cited by Goldstein (3), 
Weisenburg and McBride (12), Brain 
(2), Wepman, Bock, Jones, and Van 
Pelt (13) as part of the syndrome of 
motor (or expressive) aphasia, as op- 
posed to the speech pattern in the 
sensory and amnesic types of aphasia. 
Pick (8) undertook an analysis of gram- 
matical disturbances in aphasia in 1913, 
but the problem then lay dormant for 
many years. Agrammatism and tele- 
graphic speech have remained essen- 
tially subjective terms, based on clinical 





Harold Goodglass (Ph.D., University of 
Cincinnati, 1951) is Assistant Chief, Clinical 
Psychology Section, Veterans Administration 
Hospital, Boston. Jean Berko (Ph.D., Rad- 
cliffe College, 1958) is with the Communica- 
tions Research Division, Institute for Defense 
Analyses, Princeton, N. J. This investigation 
was supported in part by fellowship MF9261, 
from the National Institute of Mental Health, 
Public Health Service, and in part by a grant 
from the Social Science Research Council. 


Volume 3, No. 3 


257 


impression, in spite of their potential 
importance for clinical diagnosis and 
neuropsychological theory. 

Recently, a series of contributions by 
Jakobson (6), Wepman, Bock, Jones, 
and Van Pelt (13), Goodglass and Mayer
(5), and Luria (7) have applied
psycholinguistic concepts and methods
in the effort to define and make meas- 
urable the deficit which produces the 
symptoms of agrammatism. Jakobson 
applied the term ‘contiguity disorder’ 
to this phenomenon and stated the 
symptoms in linguistic terms as follows: 
(a) reduction in the variety of sen- 
tences, (b) loss of syntactic rules, (c) 
dissolution of ties of grammatical co- 
ordination and subordination, (d) loss 
of words with purely grammatical 
function and loss of inflectional endings. 
Jakobson thus implicates inflections of 
grammar, which are obligatory in Eng- 
lish, as well as syntactic arrangements, 
which are not so highly coded by the 
language. Contiguity disorder and
agrammatism thus refer to language disturbances
that more specifically fall
under the linguist’s headings of ‘morphology’
and ‘syntax.’ Syntax here
refers to the arrangement of words into
sentences, and morphology includes the
study of the rules for suffixes that are
used to indicate grammatical relationships
between words.


September 1960 







Goodglass and Mayer (5) selected 

five agrammatic and five nonagrammatic 
aphasics on the basis of clinical judg- 
ment. They found that these groups dif- 
fered sharply with respect to errors 
with syntactic constructions (that is, 
positioning of words by grammatical 
function), and differed much less in 
their tendency to omit or confuse in- 
flectional endings and the ‘small words’ 
of grammar. These findings suggested
that the morphological aspect of agram- 
matism might well be studied separately 
from the syntactical. Goodglass and 
Hunt (4) compared the ability of 
aphasics to answer questions correctly 
with words ending in a plural ‘-s’ as 
opposed to a possessive ‘-’s.’ They also
required their subjects to judge the correctness
of tape recorded sentences
from which either a final plural ‘-s’ or a
final possessive ‘-’s’ was omitted. The
possessive was clearly more difficult
than the plural, both in the expressive
and the auditory-receptive parts of the
experiment.

The present study is an investigation
of the aphasic individual’s ability to
produce orally common English words
with inflectional endings appropriate
for the completion of English sentences.
It affords some interesting comparisons 
with Berko’s (1) study of this ability 
in preschool children. The principal 
questions to be answered by the data 
were conceived as follows: 

a. Does there appear to be, in gen- 
eral, an order of difficulty for the 
English inflectional forms among apha- 
sics and, if so, how does this order 
compare with the difficulty of these 
forms for children? 

b. Is there any evidence from correlational
data that a common factor




governs the use of the various inflec- 
tional morphemes? 

c. What relation can be discovered
between impaired use of any of the 
inflectional morphemes and other, in- 
dependently measured, aspects of apha- 
sic disturbance? 

The English inflectional items investigated
were the regular forms of the
plural and possessive of the noun, the
simple past and the third person singular
present indicative of the verb, and the
comparative and superlative of the adjective,
to make a total of 10 inflectional
morphemes. These items include all of
English inflection with the exception of
the progressive ‘ing’ form of the verb.
All of these inflectional forms, except
the comparative and superlative, occur
in one of three differing forms, or allomorphs,
depending on the last phoneme
of the stem of the word; they are therefore
said to be phonologically conditioned.
These allomorphs are as
follows:

a. The third person present singular
indicative of the verb, the noun plural,
and the possessive singular, phonologically
identical and formed by [-əz],
[-s], or [-z].

b. The past tense of the verb,
formed by [-əd], [-t], or [-d].

c. The comparative and superlative
of the adjective, formed by [-ər] and
[-əst], respectively, endings which have
no variants.
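
The phonological conditioning of these allomorphs can be illustrated with a short sketch. The phoneme sets below are assumptions based on standard descriptions of English; in particular, the set of stem-final sibilants that take [-əz] is not listed in the text above, and the function names are illustrative only.

```python
# Assumed phoneme classes; the symbols follow the notation used in this paper
SIBILANTS = {"s", "z", "š", "ž", "č", "ǰ"}           # stems taking [-əz] (assumed)
VOICELESS = {"p", "t", "k", "f", "θ", "s", "š", "č"}  # voiceless stem-final phonemes

def sibilant_suffix(final_phoneme):
    """Allomorph of the plural, possessive, or third person singular ending."""
    if final_phoneme in SIBILANTS:
        return "əz"
    return "s" if final_phoneme in VOICELESS else "z"

def past_suffix(final_phoneme):
    """Allomorph of the regular past tense ending."""
    if final_phoneme in {"t", "d"}:
        return "əd"
    return "t" if final_phoneme in VOICELESS else "d"

print(sibilant_suffix("č"))  # əz, as in 'watches'
print(sibilant_suffix("t"))  # s, as in 'cats'
print(past_suffix("n"))      # d, as in 'rained'
print(past_suffix("t"))      # əd, as in 'waited'
```

The choice between the voiced and unvoiced variants follows from the voicing of the stem-final phoneme alone, which is why it can be treated as a purely phonological matter.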

Ideally, a test aimed at the problem
under consideration here uses nonsense
words. That is, if a subject says that 
the plural of ‘watch’ is ‘watches,’ it is 
always possible that he has no inter- 
nalized rule for the formation of the 
plural, that his production represents 
a specific memorized form, and that 










Goodglass, Berko: Agrammatism, Inflectional Forms 259 


he may be unable to form the plural 
of a word he has never before heard. 
Thus if the request is for the plural of 
a nonsense form like ‘gutch,’ these dif- 
ficulties are overcome. The subject who 
produces ‘gutches’ demonstrates the 
possession of an abstract rule. Berko (1) 
used such nonsense forms in her exper- 
iment with preschool children. In the 
present experiment with aphasic pa- 
tients, real words were used because it 
was found that in the majority of
instances the individual was unable or 
unwilling to complete a sentence con- 
taining nonsense words. Since inflections 
are obligatory in English, it was pos- 
sible to construct sentences that 
necessitated their use, and the indi- 
vidual’s deficits were readily measured. 


Procedure 


Subjects. The subjects were 21 apha- 
sic patients in the neurological wards 
of the Boston Veterans Administration 
Hospital and in the speech therapy program
at Lemuel Shattuck Hospital.¹
They ranged in age from 24 to 65 
years, with a mean of 48.1. Three were 
college graduates, including one physi- 
cian; seven others had completed high 
school. The criterion for inclusion of
a subject was that he have sufficient 
speech to complete the test sentences 
with scorable responses, but sufficient 
residual aphasia to make two or more 
errors. Two patients with minimal residual
aphasia were excluded because
they made no errors at all. One patient 
with a moderate anomic aphasia also 
was excluded because she made no er- 
rors in the experimental task. 


¹Testing of patients at Lemuel Shattuck
Hospital was carried out by Miss Mary Hyde, 
Speech Therapist. 


Tests Used. A 60-item sentence completion
test was composed to include
six opportunities for the use of each of
the 10 inflectional morphemes chosen
for study. The test was designed to be
given in two equivalent sections of 30
items each. The 10 morphemes were
plural [-z] or [-s]; plural [-əz]; past
[-t] or [-d]; past [-əd]; present [-s] or
[-z]; present [-əz]; possessive [-s] or
[-z]; possessive [-əz]; comparative
[-ər]; superlative [-əst]. An example
of an item (simple past) follows: ‘It
rains pretty often around here. It did
rain last night. Last night, it _____.’

Each sentence was read aloud by the 
experimenter with a natural or slightly 
exaggerated phrasing and intonation, in 
order to elicit the final word from the 
subject. The initial sound of the re- 
sponse word was supplied as a starting 
cue for subjects who needed this assist- 
ance, and the score of plus or minus 
depended solely on the correctness of 
the inflectional suffix. In some instances 
subjects misunderstood the item or gave 
a paraphrased answer and it was felt 
injudicious to press them to listen once 
more to the item. For this reason 18 
(1%) of the responses could not be 
scored as right or wrong. These items 
were arbitrarily assigned plus or minus 
scores in alternation as they happened 
to fall on the data sheet. These am- 
biguous responses were evenly scat- 
tered; no test sentence elicited more 
than one unscorable response among 
the 21 subjects and no inflectional cate- 
gory (each involving 126 responses) 
had more than two. The error thus introduced
is insignificant and random.

In the listing of test items it will be
noted that final [-s] items are scored
together with final [-z] items, and that
final [-d] items are scored together with
final [-t] items. The use of the unvoiced
form in each of these cases does not
depend on knowledge of the inflection,
but is determined by prior phonological
rules which are universal for English.
That is, the final alveolar stop
following [p, k, č, f, θ, s, š] is unvoiced
and so is the final sibilant following
[p, t, k, f, θ]. The assumption was made
that the voiced and unvoiced forms
were homogeneous in difficulty. A further
assumption in the use of this test
is that the six samples of each type of
inflection are homogeneous in difficulty
and that they reliably represent the level
of difficulty of the entire class of inflections
to which they belong. The
basis for this assumption was that the 
task for the subject was phonologically 
and grammatically the same in all six 
samples of each type of inflectional 
ending. It was felt that differences in 
difficulty among the stems of the re- 
sponse words did not appreciably in- 
fluence scores since, with the help of 
a cue, if necessary, the subjects were 
capable of supplying the stem in every 
case. The validity of this assumption 
concerning homogeneity is examined 
statistically under Results. 
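
The counts given in this section for the test and its scoring can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above (21 subjects, 10 morphemes, six items per morpheme, 18 unscorable responses); the variable names are illustrative.

```python
SUBJECTS = 21
MORPHEMES = 10
ITEMS_PER_MORPHEME = 6
UNSCORABLE = 18

items = MORPHEMES * ITEMS_PER_MORPHEME         # length of the test
total_responses = SUBJECTS * items             # responses over all subjects
per_category = SUBJECTS * ITEMS_PER_MORPHEME   # responses per inflectional category
unscorable_share = UNSCORABLE / total_responses

print(items)             # 60
print(total_responses)   # 1260
print(per_category)      # 126
print(f"{100 * unscorable_share:.1f}%")
```

The unscorable responses thus amount to between 1% and 2% of all responses, consistent with the rounded figure quoted in the text.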


As part of their examination routine, 
15 of the subjects had had an aphasia 
examination which included objective 
ratings of effective ‘Functional Speech’ 
and articulation. These subscores served 
as criteria of impairment in aspects of 
language performance apart from word 
endings. The Functional Speech score 
is obtained from a five-point rating, 
based on the Interjectional Speech and 
Spontaneous Expository Speech subtests
of the Boston VA Hospital Diagnostic
Aphasia Test.² This portion of
the examination consists of a structured 
conversation, ranging from expletives 
and conversational automatisms to ex- 
tended propositional utterances, includ- 
ing a narrative about a picture-situation. 
Subjects’ performance was scored on a 
scale from 0 to 4 with 0 indicating no 
impairment and 4 indicating severe to 
total aphasia. 

The rating scale was originally validated
on 25 patients against the judgment
of two clinicians who knew the
patients well through testing and 
therapy. It has since been applied to 
more than 100 tested subjects with 
satisfaction that the objective scoring 
standard coincided with clinical judg- 
ment, based on extended observation, 
as to the patients’ over-all verbal effi- 
ciency. Statistical evidence, though still 
scanty, supports the use of the Func- 
tional Speech rating as an index of the 
over-all severity of aphasic speech 
defect. In a study with 20 aphasic 
patients, rank order correlation coefficients
obtained between Functional
Speech and each of two subtests, Com- 
mands and Word Finding, were .71 and 
.72, respectively; intercorrelation of the 
two latter tests was only .49. The 
Functional Speech score is also based on 
performance closely similar in content 
to the two subtests (Giving Information 
and Picture Description) which Schuell 
and Jenkins (11) found to be tied for 
the second highest phi coefficient 
among 29 subtests which were tested 
for correlation with their full battery. 

²The term ‘Functional Speech’ applied to
the rating of a structured conversation test 
and the principle of scoring several levels of 
speech age oh gs on a picture-situation 
were taken from the Minnesota Test for
Differential Diagnosis of Aphasia by Schuell
(10).










Table 1. Errors made by 21 aphasics in use of 10 inflectional endings on sentence completion test in
which each ending appeared six times.

Class of              Number of Subjects        Total    Mean Errors
Inflection            Failing Each Item         Errors   per Subject

Plural [-s, -z]       4, 0, 6, 7, 5, 4            26        1.2
Plural [-əz]          7, 3, 5, 3, 5, 4            27
Past [-t, -d]         10, 9, 10, 10, 9, 6         54        2.6
Past [-əd]            Be, 152-7, 5, 3             50        2.4
Present [-s, -z]      7, 9, 5, 4, 10, 11          46        2.2
Present [-əz]         10, 11, 11, 10, 9, 11       62        3.0
Possessive [-s, -z]   11, 10, 10, 6, 12, 9        58        2.8
Possessive [-əz]      12, 15, 10, 16, 14, 16      83        4.0
Comparative [-ər]     4, 8, 3, 6, 5, 4            30        1.4
Superlative [-əst]    LONG, 75. Fy. (O57          43        2.0
Total                                            479       22.8
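
The Total Errors and Mean Errors per Subject columns of Table 1 follow directly from the item counts: the total is the sum of the six item failure counts, and the mean is that total divided by the 21 subjects. A minimal Python sketch of this arithmetic is given here, using only rows whose six item counts are legible in the table. The names `item_failures` and `summarize` are illustrative and are not taken from the original study.

```python
# Item failure counts copied from Table 1
# (number of subjects failing each of the six items in a class).
N_SUBJECTS = 21

item_failures = {
    "Plural [-s, -z]": [4, 0, 6, 7, 5, 4],
    "Past [-t, -d]": [10, 9, 10, 10, 9, 6],
    "Present [-əz]": [10, 11, 11, 10, 9, 11],
    "Possessive [-əz]": [12, 15, 10, 16, 14, 16],
    "Comparative [-ər]": [4, 8, 3, 6, 5, 4],
}


def summarize(failures, n_subjects=N_SUBJECTS):
    """Return (total errors, mean errors per subject) for one inflection class."""
    total = sum(failures)
    return total, total / n_subjects


summary = {name: summarize(counts) for name, counts in item_failures.items()}

for name, (total, mean) in summary.items():
    print(f"{name:<20} total={total:>3}  mean={mean:.1f}")
```

Rounded to one decimal place, the means computed this way agree with the printed column for these five rows.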








The articulation or Verbal Agility
subtest required the rapid reiteration
of a series of test words. Either one or 
two points per item could be earned, 
depending on the number of repetitions 
in a five-second span. Test words were 
presented both orally and visually, and 
timing of each word did not begin until 
the subject had succeeded in saying it, 
or clearly could not master its articula- 
tion. This task was designed as a wide 
range test of articulatory facility, as 
independent as possible of difficulties in 
auditory comprehension and in word- 
finding. It was included in the present 
experiment because of the impression 
gained during preliminary investigations 
that subjects who performed easily on 
the experimental task were facile in 
their articulation. 


Results and Discussion 


Order of Difficulty of the Inflec- 
tional Endings for Aphasic Subjects. 
The difficulty of each type of
inflectional ending for the aphasic subjects
was measured by the number of
errors on the six items representing each
ending. Table 1 summarizes the data.


From the column which gives the number
of subjects failing each item it is
possible to get a rough estimate of the
uniformity of the difficulty of items
within a class, as compared to the differences
between classes. The Cochran
Q test, as described by Siegel (12, pp. 
161-166) was applied to each of the 
10 sets of error scores. The null hypo- 
thesis in this test is that the items have 
equal probability of being passed. Re- 
jection of the null hypothesis would 
indicate that the items in a set are 
clearly heterogeneous in difficulty. The 
null hypothesis could not be rejected 
at the .05 probability level in any case 
except that of the past [-əd], where it
could be rejected at the .01 level.
Homogeneity of item difficulty thus
may be assumed for all but one of the 
10 sets. 
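
The Cochran Q statistic referred to above can be sketched in a few lines of Python. The function takes a subjects-by-items table of 0/1 outcomes and returns Q with its degrees of freedom. The per-subject pass/fail records of the study are not reported, so the small table `example` below is invented purely for illustration, and the function name `cochran_q` is likewise an assumption.

```python
def cochran_q(passes):
    """Cochran's Q for a subjects x items table of 0/1 outcomes (1 = item passed).

    Returns (Q, degrees of freedom). Under the null hypothesis that all items
    are equally likely to be passed, Q is approximately chi-square distributed
    with k - 1 degrees of freedom, where k is the number of items.
    """
    k = len(passes[0])  # number of items (columns)
    if any(len(row) != k for row in passes):
        raise ValueError("every subject needs one outcome per item")
    col_totals = [sum(row[j] for row in passes) for j in range(k)]
    row_totals = [sum(row) for row in passes]
    grand = sum(row_totals)
    denom = k * grand - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every subject passes all or no items")
    numer = (k - 1) * (k * sum(c * c for c in col_totals) - grand ** 2)
    return numer / denom, k - 1


# Invented illustration: six subjects, three items (not data from the study).
example = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
]
q, df = cochran_q(example)
print(f"Q = {q:.2f} with {df} degrees of freedom")
```

For a set of six items, as in the present test, Q has five degrees of freedom, so the chi-square critical values are 11.07 at the .05 level and 15.09 at the .01 level.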

Inspection of the several widely dis- 
crepant items suggests the possibility 
that some, but not all, of the easiest 
ones were more probable in English 
conversation than the others. On the 
whole, however, the assumption of 
homogeneity of difficulty within classes 







Table 2. Significance level of differences in number of errors made, each inflectional ending
having the larger error score (listed from highest to lowest) compared to each ending having
the lesser error score of pair. Figures are based on the application of the Wilcoxon signed
ranks test (two-tailed) to paired arrays of error scores of the 21 subjects.

Inflectional Ending   Inflectional Ending (Lesser Error Score of Pair)
(Larger Error Score)
                      Present  Poss      Past      Past   Present   Superl  Compar  Plur   Plur
                      [-əz]    [-s, -z]  [-t, -d]  [-əd]  [-s, -z]  [-əst]  [-ər]   [-əz]  [-s, -z]

Possessive [-əz]      .05      .01       .05       .01    .01       .01     .01     .01    .01
Present [-əz]                  ns*       ns        ns     ns        .05     .02     .01    .01
Possessive [-s, -z]                      ns        ns     ns        ns      .05     .02    .01
Past [-t, -d]                                      ns     ns        ns      .05     .01    .01
Past [-əd]                                                ns        ns      .05     .01    .01
Present [-s, -z]                                                    ns      ns      ns     .01
Superlative [-əst]                                                          ns      ib     .05
Comparative [-ər]                                                                   ns     ns
Plural [-əz]

*Not significant.


appears to have been justified. Consider- 
ation of the significance of differences 
between the error totals is therefore a 
legitimate next step. 

Differences in error totals were tested 
for significance by means of the Wil- 
coxon signed ranks test, with results 
summarized in Table 2. It will be noted 
that the inflectional endings fall into
at least three distinguishable groups with
respect to difficulty, with the complex
possessive by far the most difficult,
the comparative and the two forms of
the plural by far the easiest, and the
remaining six occupying a middle range.
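
The Wilcoxon signed ranks statistic behind these comparisons can be sketched as follows. For two paired arrays of error scores, the function drops zero differences, ranks the remaining absolute differences (tied values share an average rank), and returns the smaller of the two signed rank sums, T, with the number of non-zero pairs, n. The function name and the sample scores are hypothetical, since the individual subjects' error scores are not reported. Significance is then read from a table of critical values of T, which is not reproduced here.

```python
def wilcoxon_signed_rank(x, y):
    """Return (T, n) for paired samples x and y.

    T is the smaller of the positive and negative rank sums; n is the number
    of pairs left after dropping zero differences. Tied absolute differences
    share the average of the ranks they span.
    """
    if len(x) != len(y):
        raise ValueError("x and y must be paired, with equal length")
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    ordered = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next absolute difference is tied.
        while end + 1 < n and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[start]]):
            end += 1
        avg_rank = (start + end) / 2 + 1  # ranks are 1-based
        for pos in range(start, end + 1):
            ranks[ordered[pos]] = avg_rank
        start = end + 1
    pos_sum = sum(r for r, d in zip(ranks, diffs) if d > 0)
    neg_sum = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(pos_sum, neg_sum), n


# Hypothetical error scores of five subjects on two inflectional endings.
harder = [3, 5, 2, 6, 4]
easier = [1, 5, 4, 1, 2]
t_stat, n_pairs = wilcoxon_signed_rank(harder, easier)
print(t_stat, n_pairs)
```

A small T means that nearly all subjects differ in the same direction, which is what a significant entry in Table 2 reflects.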

The decisive importance of grammatical
function over phonological
structure in determining the difficulty
of an inflectional ending is illustrated
by those items in which exact homonyms
were used in different grammatical
settings. For example, ‘horses’ as
a noun plural was failed by only three
subjects in the item ‘The millionaire
bought a new horse. He now has a
whole stable full of ______.’ The
possessive form ‘horse’s’ was failed by
15 subjects in the item ‘This blanket
is for the horse. Whose blanket is it?
It is the ______.’ ‘Watches’ as a
noun plural was failed by five subjects
in the item ‘The doctor has a wrist
watch and a pocket watch and a stop
watch. He certainly has a lot of
______.’ The verb form ‘watches’
was failed by 11 subjects in the item
‘John likes to watch while Tom draws
pictures, so Tom draws and John ______.’

The interpretation of the differences
between the error scores may be limited
by systematic differences in the
probability structure of the incomplete
sentences which were used to elicit the
responses. That is, the order of difficulty
of inflectional forms in free
conversation is not necessarily the same
as that obtained under the experimental
conditions. It may be pointed out, for
example, that the last word in the
plural-demanding sentences is often preceded
by a strong cue for a plural
noun, such as ‘a lot of ...’ or the plural
form of the verb. Yet, the item which
had no such strong cue, ‘My rose bush
is in bloom. It is all covered with
beautiful red ______,’ is just as
easy, on the average (five errors), as
the other five items in its class.




A strong predominance in errors with 
the possessive was found also by Good- 
glass and Hunt (4). They used a stim- 
ulus structure that was apparently free 
of bias in the transitional probabilities 
of the words in the plural as compared 
to the possessive items. Their items 
were in the form: ‘The dog chewed up
my sister’s gloves. (Read twice by 
examiner.) What did the dog chew up? 
Whose gloves were they?’ It is there- 
fore suggested that the differences in 
level of difficulty for the various inflec- 
tional endings, for aphasic subjects, 
cannot be dismissed as artifacts of the 
particular sentence structures chosen 
for the test items. 


Comparison with Performance of
Nonaphasic Subjects. The sentence 
completion tests used in the present 
study were given also to 15 brain- 
injured nonaphasic individuals who 
were neurological patients at the Boston 
Veterans Administration hospital. They
were somewhat older than the aphasic 
subjects, ranging in age from 39 to 65 
years with a mean age of 52.6. They 
closely resembled the aphasic subjects 
in educational attainment; two had 
graduated from college, four from high 
school. The two groups were not 
matched on performance IQs, but the
nonaphasic individuals appeared to have 
at least as much intellectual impairment 
as the aphasic subjects. The majority 
had right-sided brain damage or diffuse 
bilateral disease; two had been aphasic 
early in their illness. 

Of the nonaphasic individuals, 10 per- 
formed with no errors at all, two made 
three errors, two made two errors, and 
one made one error. All five who made
errors omitted the complex possessive
[-əz]; this error occurred seven times.
Three of the five omitted a simple
possessive, as well, and one twice sub- 
stituted a past tense for the third person 
present. None of these five had ever 
been considered aphasic. The most 
grossly deteriorated of these brain-in- 
jured individuals, who was confused, 
disoriented in time and place, and almost 
devoid of memory, performed without 
error under the conditions of the exper- 
iment. It is interesting to note that the 
items occasionally failed by the non- 
aphasics were also the most difficult 
for the aphasic subjects. 


Comparison with Data on Children. 
According to Rapaport’s (9, p. 186)
summary, Ribot’s rule holds that organic
defects, such as aphasia, injure
the latest learned patterns before they
affect the earliest learning. One might
therefore expect the aphasic’s loss to
mirror the pattern of the child’s acquisition:
that is, that the forms most difficult
for the aphasic should be the ones
acquired latest by the children. Data
in the present study indicate that the
pattern of aphasic deficit in English
inflectional morphology only partially
resembles the pattern of the child’s
learning, as found by Berko (1). For
example, Berko found that children
regularly have more difficulty with the
phonologically complex³ [-əz] and
[-əd] than with the simpler [-s, -z] or
[-t, -d] allomorphs in all the grammatical
functions in which these endings
are used. Percentages of correct responses
of 80 children (four to seven
sponses of 80 children (four to seven 


³The expression ‘phonological complexity’
here refers to the fact that the [-əz] and
[-əd] forms apply to fewer cases and constitute
exceptions to more general (hence ‘sim-
pler’) rules; it does not mean that these forms 
are harder to pronounce than their simple 
allomorphs. 







Table 3. Percentage of correctly inflected nonsense words supplied by 80 children, aged four to seven,
as reported by Berko (1).

Inflectional       Required    Percent
Class              Response    Correct

Plural [-z]        wugs          91
Plural [-əz]       tasses        36
                   nizzes        28
Possessive [-z]    wug’s         84
Possessive [-əz]   niz’s         49
Present [-əz]*     loodges       56
                   nazzes        48
Past [-d]          binged        li)
Past [-əd]         motted        33

*The simple [-z] form of this inflectional morpheme was not sampled.


years) to a sentence completion test, 
using nonsense words, are given in 
Table 3. For aphasics the corresponding 
difference in difficulty of the complex 
over the simple allomorph was found 
to a significant degree for the posses- 
sive; the difference between the simple 
and complex allomorphs of the third
person singular also tends in this direction;
however, there is no difference,
for aphasics, between the two forms of
the plural or past tense endings. The
errors of aphasics, as compared to children’s,
are much more influenced by
morphemic differences than by phonological
complexity. In some instances
aphasics were unable to supply any of
the allomorphs of a given inflectional
morpheme, regardless of phonological
simplicity. For example, three aphasics
failed all of the possessive endings in
the simple [-z] form. In the Berko
study four-year-old children were consistently
able to supply a simple possessive
ending.

In making the comparison with
children, it should be recalled that the
aphasics were required to supply actual
English forms; the children, on the
other hand, had to demonstrate their
generalization of the inflectional rules
to nonsense words. Berko reports two
instances in which real English words 
were to be supplied by the children. 


Table 4. Intercorrelations between the error scores for the 10 inflectional endings. (With 21
subjects a correlation coefficient of .37 is required for significance at the .05 level.)

                     Plur      Past      Past      Pres      Pres      Poss      Poss      Compar    Superl    Combined
                     [-əz]     [-t, -d]  [-əd]     [-s, -z]  [-əz]     [-s, -z]  [-əz]     [-ər]     [-əst]    Error Score

Plural [-s, -z]      .45       .34       .50       .65       .63       .65       .33       AS        .53       .67
Plural [-əz]                   .35       .72       .54       .65       .40       .46       .60       .53       .79
Past [-t, -d]                            .32       .03       .16       -.18      -.02      .26       2         .43
Past [-əd]                                         A2        .56       .40       .52       4B        .81       .70
Present [-s, -z]                                             .72       .62       .41       .03       .47       .63
Present [-əz]                                                          .62       .60       .19       .61       .74
Possessive [-s, -z]                                                              .75       8         .33       .60
Possessive [-əz]                                                                           .28       .35       .56
Comparative [-ər]                                                                                    .21       .41
Superlative [-əst]



















The past tense ‘melted’ was supplied by 
73% of her children as compared to 
33% for the nonsense form ‘motted’; 
the plural ‘glasses’ was supplied by 91%
as compared to 36% for the nonsense
form ‘tasses.’


Correlations among the Subscores. 
Rank order correlations were computed 
among the 10 arrays of error scores, 
with the results listed in Table 4. As a 
measure of the agreement between each 
of the subscores and their combined 
total, the rank order correlation was 
computed between each of the sub- 
scores and the total error scores of the 
21 subjects. For this purpose the total 
error score was summed separately for 
each computation, omitting the score 
with which the total was being correlated.
These correlations are also included
in Table 4. Because of the small
number of subjects, it would be rash 
to draw conclusions from any but the 
grossest differences between correlations.
There appears to be a common
factor contributing to the error scores 
of all the inflectional morphemes, with 
the possible exception of the simple 
past. 
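
The subscore-to-total correlations described above, in which each total omits the subscore being correlated with it, can be sketched as follows. The sketch computes the rank order correlation as a Pearson correlation on average ranks, a common way of handling ties; whether the authors used this form or the classic difference-squared formula is not stated, so that choice is an assumption. All function names and the sample scores are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Rank order correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(sxx * syy)


def subscore_vs_rest(scores, j):
    """Correlate subscore j with the total of all other subscores."""
    sub = [row[j] for row in scores]
    rest = [sum(row) - row[j] for row in scores]  # total omitting subscore j
    return spearman_rho(sub, rest)


# Hypothetical error scores: four subjects by three inflectional endings.
scores = [[0, 1, 2], [1, 2, 3], [2, 3, 5], [3, 4, 4]]
print(round(subscore_vs_rest(scores, 0), 3))
```

Omitting the subscore from its own total matters because a subscore otherwise correlates with itself as part of the total, which inflates the coefficient.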


Relationship with Other Measures of 
Severity of Aphasia. The total error 
score on the experimental task was 
tested for correlation with the Func- 
tional Speech score and the Verbal 
Agility (articulation) score, which 
were described earlier. Unfortunately, 
these scores were available for only 15 
of the 21 subjects. The obtained rho 
of .32 between Functional Speech and 
the total inflectional error score is well 
below the level required for statistical 
significance. This low correlation is illustrated
clinically by the subject ranking
fifth in the experimental task, who
nevertheless had such a severe expressive 
aphasia that he could initiate practically 
no speech, although he could repeat 
words or short phrases. The patient 
ranking highest on the experimental 
task obtained a Functional Speech rat- 
ing at the ‘moderately severe’ level. 


For the same group of 15 subjects, a 
correlation of .69 was obtained between 
the Verbal Agility subtest and the total 
inflectional error score. This correlation
accords with the clinical impression,
gained in exploratory study, that
patients who articulate individual words
easily also have few omissions of
inflectional endings.


Among the subjects who did well in 
the experimental task were some who 
had extreme word-finding difficulty 
and some whose speech was essentially 
limited to one-word sentences. That is, 
among the subjects who would be 
called clinically agrammatic because 
they speak in isolated words or short 
phrases, there are some—usually subjects 
having facile articulation—who are not 
much impaired in the test of gramma- 
tical morphology. It should be noted, 
however, that the correct syntactical
context in each test sentence was already
structured for the subject and
was extremely important in determining
the inflections supplied. This cue is
absent in the normal speech of the patient
who lacks a repertory of sentence
patterns, so that the correct inflections
may be less in evidence under the conditions
provided by his spontaneous
speech. Under the present experimental
conditions, at least, there appears to be
a degree of independence between the
morphological and syntactical aspects
of agrammatism.







Significance of Findings 
and Research Implications 


The present study has shown that 
aphasic subjects vary widely in their 
ability to supply inflectional endings 
under a particular experimental condi- 
tion; that inflectional endings vary, 
according to their grammatical function,
in their availability to aphasics.
It further appears that if one eliminates
the minimally aphasic and the totally
aphasic, then the degree of aphasic
handicap has little predictive value for
the ability to supply inflectional endings.
On the other hand, facility with
articulation appears to be positively
correlated with this ability. However,
a considerably larger sample of aphasics
should be tested before either of the
two latter relationships can be claimed
with any assurance.

As yet, there is no information on 
the relation between performance on 
the experimental task and use of in- 
flections in a sample of free conversa- 
tion; neither is the relation known 
between performance on the present 
test and facility with English syntactic 
forms, either in structured tests or in 
free conversation. A larger sample is 
necessary also before it can be deter- 
mined whether any of the inflectional 
endings of the present test are differ- 
entially related to each other or to 
other diagnostic indicators. The present
authors propose that continued appli- 
cation of linguistic categories to the 
design of experimental tasks offers the 
most promise for identifying agramma- 
tism more precisely. As greater preci- 
sion is gained in stating operationally 
what is meant by ‘agrammatism’ it 
should be possible to suggest how or 


through what psychological processes 
this defect operates. 

From a comparison of the present 
experimental data with the performance 
of preschool children, it appears that 
no simple parallel can be drawn between
the normal acquisition of inflectional
forms and their loss through
brain injury. However, the order of 
acquisition of regular inflectional end- 
ings by very young children is still 
not known. If they are acquired in a 
definite order, Berko’s data suggest that 
this learning is nearly complete by the 
age of four when children are able to 
handle the more simple and regular 
forms of all English inflectional cate- 
gories. In addition to information on 
the inflectional usage of three-year-olds, 
it would be useful to have a frequency 
count for the occurrence of the various 
English inflectional endings in normal 
adult speech. 


Summary 


This study was concerned with the
morphological, as distinct from the syn- 
tactical, aspects of grammatical dis- 
turbance in aphasia. Specifically, it 
investigated the ability of 21 aphasic 
subjects to supply, by means of a sen- 
tence completion test, correct inflec- 
tional endings for nouns, verbs, and 
adjectives. The results suggest the 
following tentative conclusions: 
a. For aphasics, the difficulty of various
inflectional endings follows a
definite order which is based on gram- 
matical function, not phonological simi- 
larity. 

b. Phonological complexity is not 
as important for aphasics as for children 
in determining the difficulty of inflec- 
tions. 










c. A common factor appears to underlie
adequate performance with all
inflectional endings studied except the 
simple past. 

d. The inflectional ending score is 
related to verbal agility in articulation, 
but is not related to over-all adequacy 
of speech. It is suggested that, in some 
aphasics, the syntactic and the inflec- 
tional aspects of grammar may be im- 
paired independently of each other. 


References 


1. Berko, Jean, The child’s learning of English
morphology. Word, 14, 1958, 150-177.

2. Brain, R., Aphasia, apraxia, and agnosia.
In S. A. K. Wilson (Ed.), vol. 3, Neurology
(2nd ed.). Baltimore: Williams and
Wilkins, 1955.

3. Goldstein, K., Language and Language
Disturbances; Aphasic Symptom Complexes
and Their Significance for Medicine
and Theory of Language. New York:
Grune and Stratton, 1948.

4. Goodglass, H., and Hunt, J., Grammatical
complexity and aphasic speech. Word,
14, 1958, 197-207.

5. Goodglass, H., and Mayer, J., Agrammatism
in aphasia. J. Speech Hearing Dis.,
23, 1958, 99-111.

6. Jakobson, R., Two aspects of language
and two types of aphasic disturbances. In
R. Jakobson and M. Halle, Fundamentals
of Language. The Hague, Netherlands:
Mouton, 1956.

7. Luria, A. R., Brain disorders and language
analysis. Lang. Speech, 1, 1958, 14-34.

8. Pick, A., Die agrammatischen Sprachstörungen.
Berlin: 1913.

9. Rapaport, D., Emotions and Memory.
New York: International Univ. Press,
1950.

10. Schuell, Hildred, Minnesota Test for
Differential Diagnosis of Aphasia. Minneapolis:
Univ. Minnesota Press, 1955.

11. Schuell, Hildred, and Jenkins, J. J., The
nature of language deficit in aphasia.
Psychol. Rev., 66, 1959, 45-67.

12. Siegel, S., Nonparametric Statistics for
the Behavioral Sciences. New York: McGraw-Hill,
1956.

13. Weisenburg, T., and McBride, Katharine
E., Aphasia. New York: Commonwealth
Fund, 1935.

14. Wepman, J. M., Bock, R. D., Jones,
L. V., and Van Pelt, Doris, Psycholinguistic
study of aphasia: a revision of
the concept of anomia. J. Speech Hearing
Dis., 21, 1956, 468-477.





Nasal Syllabics in American English 


ANDRÉ MALÉCOT


The purposes of this article are to 
summarize briefly what has been writ- 
ten on the subject of syllabic nasal 
consonants in American English, their 
articulation, acoustic characteristics, 
their role in the phonemic structure of 
the language; to resolve if possible a few 
discrepancies of opinion in these areas; 
and to present experimental data con- 
cerning the acoustic cues for the 
phonemes they purportedly represent. 
This is the last in a series of three 
studies by this writer on nasal con- 
sonant phonemes in different contexts, 
the first dealing with [m n ŋ] in VC
and CV syllables (9), the second with 
the reduction of the consonant and as- 
similation of the vowel in tonic VNC 
(N=nasal consonant phoneme) in cases 
where N and C are homorganic (11); 
the present study concerns atonic 
CVN+V-or-O where the interconsonantal
vowel has been suppressed. In all
other contexts where an N is involved 
in a cluster, it behaves like either an 
initial or a terminal [m n ŋ] as described
in the first study.


Definition. A syllabic nasal consonant 
is best defined as an N functioning as 
the nucleus of a syllable, in other words 





André Malécot (Ph.D., University of Penn- 
sylvania, 1952), Associate Professor of French, 
University of California, Riverside, is on sab- 
batical in France. The exploratory studies 
leading to this study were conducted by him 
at the Haskins Laboratories, New York. By 
mutual agreement of the editors, essentially 
the same material is appearing in Studia Lin- 
guistica (Lund, Sweden, and Copenhagen). 


Volume 3, No. 3 268 


(15, p. 13) ‘one which is the most res- 
onant sound in the syllable’ or one 
(6, p. 69) that ‘can form syllables, 
alone or with other consonants.’ A 
recent textbook (18, p. 129) even goes 
so far as to suggest that they should be 
regarded as vowels ‘since they fulfill 
every requirement of being vowels ex- 
cept that as syllabics they never occur 
initially in English.’ Stetson’s (14, p. 
203) definitions of consonants and 
vowels would appear to support this 
view: ‘Consonants are accessory move- 
ments which arrest and release the chest 
pulse . . . the syllable consists essentially 
of a single chest pulse usually made au- 
dible by a vowel.’ 


Occurrence. Syllabic consonants in 
American English occur mainly in 
atonic position, immediately after an- 
other consonant¹ (button [bʌtn̩], Jack
and Jill [dʒækŋ̩dʒɪl]), except that nasal
syllabics cannot follow other nasals. ‘If 
any vowel whatever, no matter how 
obscure or short, intervenes, it becomes 
the syllabic sound and the consonant 
is no longer syllabic’ (6, p. 69). This 
means that syllabic nasal consonants can 
occur only if the two articulations are 
homorganic (rip ’em, written, kickin’) 
or, in the event that they are not, ‘if 
the opening is not made wider than for 
either of the consonants’ (6, p. 89), for 


¹Notable exceptions include the single
‘hmmm’s’ operating as carriers for intonation 
expressions of interrogation, hesitation, dis- 
gust, or the pair indicating affirmation or as- 
sent. 


September 1960 





Figure 1. Spectrograms of two utterances:
above, hittin’, with a nasal syllabic; below,
hit ’em, with a vowel and nasal consonant.


example, rhythm [rɪðm̩]. In all such
cases the principal articulatory change 
separating the first consonant from the 
syllabic (the only change when the ar- 
ticulations are homorganic) is the sud- 
den release of the velar occlusion, which 
takes the place of the usual buccal 
release of the first consonant when a 
vowel rather than a syllabic consonant 
follows. Actually, as Kenyon (6) points 
out, nasal syllabics occur most fre- 
quently when the two articulations are 
homorganic. Figure 1 shows spectro- 
grams of two utterances, hittin’ and 
hit ’em, the first with CN, the second 
with VN. 


Phonemic Status. Syllabic nasal con- 
sonants present a dilemma that descrip- 
tive linguists have apparently been 
unable to resolve. The existence of 
contrasts such as evening (gerund of
‘to even’) [n̩]: evening (time of day)
[n] in which the difference is clearly
phonemic, and not stylistically conditioned
as it is in rhythm [rɪðəm]-or-[rɪðm̩],
prevents their being interpreted
as mere allophones of nasal consonants. 
Furthermore, as has been seen, syl- 
labics function as vowels. The question 
then is whether they shall be considered 
as nothing more than the phonetic real- 


ization of cases of CVN in which the 
vowel has disappeared—certainly ‘what 
we imagine we pronounce is a con- 
sonant’—or whether the N as a whole 
shall be recognized as a phoneme, or 
perhaps its syllabicity alone, /,/, in- 
terpreted as a secondary phoneme. 

Trager-Bloch (17), Swadesh (16), 
and Gleason (4)—most structural lin- 
guists, it would appear—favor their 
interpretation as VN, partly (12, p. 
140) ‘for structural symmetry’ and 
partly (17, p. 232) because ‘the pho- 
netic similarity of the nucleus of such 
syllables is greatest to some allophones 
of the already established phoneme /ə/,
and this lateral- or nasal-colored sylla- 
bicity is in complementary distribution 
with the members of that phoneme.’ 
Bloomfield (1, p. 123) maintains that 
the ‘syllabic stress’ alone is phonemic, 
that /m̩/, /n̩/, should be used except
in ‘cases where the syllabic value is due 
merely to the character of the sur- 
rounding phonemes.’ Chao and Pike 
go along with ‘the little vertical stroke 
under syllabic [r̩, l̩, m̩, n̩],’ the latter
(12, p. 141) with the reservation that 
‘the implication for phonemic theory 
is not clear.’ 

Exactly how this question is resolved 
is largely irrelevant to the present 
study—these logistics have little or no 
bearing on the articulation, transmis- 
sion, or perception of the intended 
phones. What is dismaying in most of 
these analyses is the large number of 
nasal consonants (liquids as well) that 
are erroneously identified as syllabics, 
a practice that certainly contributes 
little to clarifying the problem. In view 
of the articulatory and acoustic char- 
acteristics of syllabic consonants as de- 
scribed earlier, these mistakes become 







immediately obvious: The word bottom 
is a favorite example, and is transcribed 
[batm] or [batm] by Sturtevant, 
Pike, and Bloomfield, to mention only 
a few. As Kenyon (6, p. 91) points 
out, ‘In some cases where a syllabic 
consonant is possible, it is very unlikely, 
as in bottom.’ When a nasal is indeed 
syllabic after a stop, the stop and the 
nasal are homorganic in normal speech, 
and cases such as open and Jack and Jill 
should be resolved, as Kenyon indicates, 
respectively as [opm̩] and [dʒæk ŋ̩
dʒɪl]. Another current type of mistake
is to interpret an N as syllabic after 
another N, for example (1, p. 122)
maintenance ['mejntnns], penance 
[‘penns]. ‘Such transcriptions as [kamn] 
for common, [venm] for venom are
wrong; without a slight vowel there 
would be but one syllable’ (6, p. 91). 
Thus, maintenance should be tran-
scribed [mejtnons], and penance
[‘penons]. Bloomfield has perhaps been
the worst offender in view of the list 
of highly unlikely, if not impossible, 
cases he lists to support his analysis. 
Aside from those already mentioned, 
he presents in his Language (1, p. 122) 
the following contrasting pairs: apron 
[‘ejprn]: pattern [‘petrn], char ’em 
[‘car m]: charm [¢arm], anatomy 
[e'netmij]: met me ['met mij]; the cor-
rect transcriptions are [ejpron], 
[petsn], [tfarom], [tfarm], 
[anzetomi], and [metmi]. On the other 
hand all those whose aim it is to analyse 
speech as it is normally articulated (6, 
7, 13, 18) give consistently accurate 
transcriptions. 


Synthetic Speech Exploratory Study 


The experimental work of this study 
began with exploratory studies con- 


ducted at the Haskins Laboratories. 
These experiments (2), conducted with 
hand-painted spectrograms played on a 
pattern playback, were undertaken to 
establish optimum spectra for syllabic 
nasals and to determine what contribu- 
tions they make to the identification of 
[m n ŋ]. Although no attempt was
made to prove that the values obtained 
correspond to real speech, spectra were 
nevertheless found which, to the trained 
ear, can differentiate between rip ’em 
and written in cases where they are 
appended to a single pattern ambig- 
uously heard as [rɪp] or [rɪt], or be-
tween sittin’ and sicking (sickin’) in 
cases where they are appended to a 
single pattern ambiguously heard as 
[sɪt] or [sɪk]. In other words, in the ab-
sence of other possible cues, those con- 
tained in the formant frequencies of the 
syllabic nasals cause a preceding stop to 
be heard as homorganic with that nasal. 
The contrast between [n] and [ŋ]
was much less clear than between [m] 
and [n]—it was as if discrimination in 
such cases is on a simple binary basis, 
[m] in one class, [n] and [ŋ] in the
other, which would appear to confirm 
an earlier observation to the same effect 
by this writer (9). 

The remaining question to be investi- 
gated with synthetic speech concerned 
the role of the closing transitions of the 
preceding stop in cases such as sip ’em, 
sittin’, sickin’ [sɪkŋ̩] in the perceived
place of articulation of the syllabics— 
this question is practically the same as 
that concerning transition versus steady- 
state formant cues in the earlier study 
on nasal consonants. Appropriate 


[p t k] closing transitions were paired
in all combinations with the optimum
[m n ŋ] patterns: the stimuli were consistently
heard as containing the sequences
[pm], [tn], or [kŋ] depending
on the identity of the stop.


Tape-Cutting Experiments 

These tentative results suggested a 
series of tape-cutting experiments with 
real speech (0, for description of tech- 
nique) to determine whether commonly 
occurring nasal syllabics are indeed 
heard as homorganic with the preceding 
stop, regardless of how the articulations 
are paired, and to test the hypothesis 
that the closing transitions of the pre- 
ceding consonant determine the per- 
ceived place of both. Finally, a short 
experiment was added to determine if a 
completely nonspeech steady-state seg- 
ment can be successfully substituted for 
the syllabic. 


The tape-cutting work began with a 
preliminary experiment to find appro- 
priate durations for the ‘hold’ (acousti- 
cally, the steady-state portions, that is, 
the silent interval for [p t k], the 
‘voice-bar’ for [b d g], the fricative 
patch for the fricatives) of the con- 
sonant preceding the syllabic, and for 
the syllabic itself. Various hold dura- 
tions ranging from the obviously too 
short to the too long, and disposed in 
steps of 30 msec, were paired with a 
number of Ns of various durations
graduated in steps of 50 msec in a
number of representative cases, such as
rip ’em, ridden, risen. The degree of
naturalness of the stimuli was judged 
by the author and two linguist-col- 
leagues. Optimum values were judged 
to be about 130 msec for the ‘hold’ 


Table 1. Interchanges of voiceless stops + N (nasal syllabic).

Stimuli Judged as
Labial Cons + Dental Cons + Velar Cons + Other
m n ŋ m n ŋ m n ŋ
nip + m 12 3 4
nip + n 13 1 2 7 1 1
nip + ŋ 13 11 1
nit + m 2 1 15 7
nit + n 1 1 1 17 4 1
nit + ŋ 1 18 4 a
nik + m 2 11 12
nik + n 1 1 1 6 16
nik + ŋ 3 1 2 3 4 11 1
hop + m 15 2 2 5 1
hop + n 18 1 6
hop + ŋ 13 3 9
hot + m 1 1 17 1 1 1
hot + n 5 1 1 17 1
hot + ŋ 4 16 2
hok + m 6 3 8 1 2 5
hok + n 10 2 12 1
hok + ŋ 6 5 2 5 1 2 4













segment and about 200 msec for the N. 
It is emphasized here that the aim was 
to find values adequate for the proposed 
experiments and not necessarily to 
establish optima for real speech. Never- 
theless, the values correspond with 
those observed in spectrograms and by 
other workers (8). 
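
The pairing scheme of this preliminary experiment can be sketched as a small grid of candidate durations. This is only an illustration: the step sizes (30 msec for the hold, 50 msec for the N) come from the text, but the lower and upper bounds below are hypothetical, since the text describes them only as ranging from "obviously too short" to "too long".

```python
from itertools import product

HOLD_STEP_MS = 30  # step size stated in the text
N_STEP_MS = 50     # step size stated in the text


def duration_grid(hold_min=40, hold_max=250, n_min=50, n_max=400):
    """Return every (hold, N) duration pair, in msec, on the stepped grid.

    The four bounds are assumptions made for this sketch; the article
    reports only the step sizes and the values finally chosen.
    """
    holds = range(hold_min, hold_max + 1, HOLD_STEP_MS)
    syllabics = range(n_min, n_max + 1, N_STEP_MS)
    return list(product(holds, syllabics))


grid = duration_grid()
print(len(grid), "candidate pairings")
print((130, 200) in grid)  # the pairing judged most natural
```

With these assumed bounds the grid happens to contain the reported optimum of 130 msec for the hold and 200 msec for the N; other bounds would serve equally well for the illustration.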

Experiment I. The first experiment in 
the final series was designed to test the 
hypothesis that the closing transitions 
of the stop, in cases involving [p t k] 
plus N, determine the perceived place 
of articulation of the syllabic. Segments,
200 msec in duration, of recordings of
[m n ŋ] made in isolation, were
appended after a silent interval of 130
msec, each in turn, to recordings of
[nɪp], [nɪt], [nɪk], [hɔp], [hɔt],
[hɔk], also recorded in isolation, with
vowel durations of about 100 msec and
unreleased [p t k]. These syllables were
recorded on a tone of about 120 cps,
normal for a baritone voice, the syllabics
a half-tone lower. The layout of
this experiment, together with the results,
is shown in Table 1. The two
vowels, [ɪ] and [ɔ], were used to
indicate possible contingency of the
phenomenon in question upon vowel 
color. It is also noted that all possible 
combinations shown yield meaningful 
words in American English whether 
standard or colloquial. Examples such 
as hot ’em (from colloquial hotten, ‘to 
heat’) may be considered a bit forced, 
but their inclusion for the sake of com- 
pleteness here is justified by their fre- 
quent use in structural analyses of 
American English. The stimuli were 
randomized and presented, each one 
twice, to a group of 25 phonetically 
naive undergraduates of the University 
of California at Riverside for identifica- 
tion. The only direction given consisted 
in copying the entire list of possibilities 


Table 2. Interchanges of voiced stops or fricatives + N (nasal syllabic).

Stimuli Judged as
Labial Cons + Dental Cons + Velar Cons + Other
m n ŋ m n ŋ m n ŋ
rib + m 14 3 2 6
rib + n 15 3 6 1
rib + 9 3 5 2 4 2
rid + m 7 16 1
rid + n 2 1 19 3
rid + ŋ 2 21 2
rig + m 1 5 8 11
rig + n 1 1 1 1 7 9
rig + ŋ 1 5 8 11
riz + m 2 23
riz + n 25
riz + ŋ 3 22
ræz + m 25
ræz + n 3 22
ræz + ŋ 1 23 1










Table 3. Interchanges of human first syllable + pure tone at 100 cps.

Stimuli Judged as
Labial Cons + Dental Cons + Velar Cons + Other
m n ŋ m n ŋ m n ŋ
nip + Pure Tone 16 6 1 1 1
nit + Pure Tone 1 2 1 17 2 1 1
nik + Pure Tone 5 5 13 2
hop + Pure Tone 11 5 1 1 5 1 1
hot + Pure Tone 4 1 1 1 10 2 6
hok + Pure Tone 13 + 3








on the blackboard and telling the sub- 
jects that all stimuli would be from that 
list. Judgments entered in the column 
headed ‘other’ represent isolated fail- 
ures to follow these directions. The 
results confirm the hypothesis that Ns 
are identified in these cases, as to their 
place of articulation, on the basis of the 
closing transitions of the preceding 
stop. The only general exception involves
[hɔk] and is explained by the
low identifiability of velars after [ɔ],
a phenomenon noted earlier by this and 
another worker (5, 10), and explained 
by Delattre (3). The judgments in the 
present test indicate almost complete 
confusion in this case. 
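
The layout of Experiment I can be restated as a short sketch: six first syllables crossed with three spliced nasals, each stimulus presented twice in random order, with the hypothesis predicting a percept governed by the stop alone. The ASCII labels, the function names, and the seed are assumptions of this sketch; the original randomization procedure is not described.

```python
import random
from itertools import product

# ASCII stand-ins for the recorded syllables and syllabics ("ng" = velar nasal).
SYLLABLES = ["nip", "nit", "nik", "hop", "hot", "hok"]
NASALS = ["m", "n", "ng"]
HOMORGANIC = {"p": "m", "t": "n", "k": "ng"}


def build_stimuli():
    """Every first syllable paired with every spliced nasal (6 x 3 = 18)."""
    return list(product(SYLLABLES, NASALS))


def presentation_order(seed=0):
    """Each stimulus twice, shuffled; the seed is an arbitrary choice."""
    trials = build_stimuli() * 2
    random.Random(seed).shuffle(trials)
    return trials


def predicted_percept(syllable, spliced_nasal):
    """Sequence predicted by the hypothesis under test: the closing
    transitions of the stop fix the place of both stop and nasal,
    whichever nasal was actually spliced in."""
    stop = syllable[-1]
    return stop + HOMORGANIC[stop]


print(len(build_stimuli()), len(presentation_order()))
print(predicted_percept("nip", "ng"))
```

Under the hypothesis, a stimulus such as nip + [ŋ] is expected to be reported as containing [pm]; the tabulated judgments are what test that expectation.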

Experiment II. A second experiment 
was then designed to determine if these 
same principles hold also when nasal 
syllabics follow voiced stops or frica- 
tives. The layout and results are shown 
in Table 2. The paucity of meaningful
possibilities involving such cases is responsible
for the reduced coverage in
this experiment; even then, some of the
possible utterances are meaningless. The
results, however, confirm those of Ex- 
periment I. 

Experiment III. A final experiment
was added to test a hypothesis suggested
by the results of the other two experiments
and of previous nasal studies:
that the steady-state N segment serves
as little more than a class marker, differentiating
nasals from other classes of
consonants (liquids, for example) in
the context in question, and that here
at least some nonspeech segment might
be substituted without detriment to the
identification of the nasals. Table 3
shows the layout and results. A segment
of pure tone at 100 cps, recorded from
an electronic signal generator at an
appropriate amplitude and 200 msec in
duration, was appended, after the usual
interval of 130 msec, to the initial syllables
of Experiment I. In these contexts
the tones were all heard as Ns and
all identified as homorganic with the
preceding stop, with the exception of
[hɔk] + 100 cps, a further confirmation
of the confusion of velars after [ɔ]
noted above.
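
A digital analogue of the substituted segment may make the manipulation concrete. The sketch below builds 130 msec of silence followed by 200 msec of a 100-cps sine wave. The sampling rate and amplitude are assumptions; the original segment was recorded on tape from a signal generator, not computed.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an assumption of this sketch
GAP_MS = 130        # silent interval stated in the text
TONE_MS = 200       # tone duration stated in the text
TONE_HZ = 100.0     # tone frequency stated in the text


def tone_segment(amplitude=0.5):
    """Return the silent gap followed by the pure tone, as a list of samples."""
    n_gap = SAMPLE_RATE * GAP_MS // 1000
    n_tone = SAMPLE_RATE * TONE_MS // 1000
    silence = [0.0] * n_gap
    tone = [
        amplitude * math.sin(2.0 * math.pi * TONE_HZ * i / SAMPLE_RATE)
        for i in range(n_tone)
    ]
    return silence + tone


segment = tone_segment()
print(len(segment), "samples")
```

In a replication, such a segment would be appended to the recorded first syllable in place of the natural syllabic.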


Effect of Context. One factor that 
undoubtedly plays a very important 
role in the identification of the pho- 
nemes involved, and one not mentioned 
to this point, is context. Previous 
studies (5, 10) have indicated that clos- 
ing transitions in English are relatively 
weak as place cues. Yet the reduction
of the vowel and the consequent partial
overlaying of the preceding consonant
and the nasal in cases such as written,
[rɪtən] > [rɪtn], resulting in the loss
of two sets of transitions (those ending
the first consonant and those beginning
the N), has occurred readily, and the
combination remains identifiable. There
are, furthermore, the cases where the
articulations are not strictly homorganic,
for example, [ræz] + N = razz
’em or razzin’. Clearly, context plays a
major role which, on the one hand, 
explains why pure tones were usable 
in the experiments under discussion, 
and suggests, on the other, that this and 
other substitutions might be made in 
automatic synthesizers operating at the 
receiving end of coded communications 
systems. 


Summary 


The literature on syllabic nasal con- 
sonants in American English is briefly 
summarized. Differences of opinion 
concerning their articulation, acoustic 
characteristics, and role in the phonemic 
structure of the language are discussed. 
Experimental data are given from tape- 
cutting experiments with real speech 
which suggest that certain commonly 
occurring nasal syllabics are identified 
as to their place of articulation by the 
preceding stop; that the closing transitions
of the preceding consonant determine
the perceived place of both;
and that a completely nonspeech steady-state
segment can be successfully
substituted for the syllabic. Finally, the
suggestion is made that context is im- 
portant as an identification cue. 


References 


1. Bloomfield, L., Language. New York:
Holt, 1933.

2. Borst, J., The use of spectrograms for
speech analysis and synthesis. J. audio
eng. Soc., 4, 1956, 14-23.

3. Delattre, P., Unreleased velar plosives
after back-rounded vowels. J. acoust.
Soc. Amer., 30, 1958, 581-582.

4. Gleason, H. A., An Introduction to Descriptive
Linguistics. New York: Holt,
1955.

5. Householder, H., Unreleased p t k in
American English. In M. Halle (Ed.),
For Roman Jakobson. The Hague: Mouton,
1956.

6. Kenyon, J. S., American Pronunciation;
a Textbook of Phonetics for Students of
English. Ann Arbor: George Wahr, 1943.

7. Kenyon, J. S., and Knott, T. A., A Pronouncing
Dictionary of American English.
Springfield: Merriam, 1953.

8. Lisker, L., Closure duration and the intervocalic
voiced-voiceless distinction in
English. Language, 33, 1957, 42-49.

9. Malécot, A., Acoustic cues for nasal consonants,
an experimental study involving
a tape-splicing technique. Language, 32,
1956, 274-284.

10. Malécot, A., The role of releases in the
identification of released final stops, a
series of tape-cutting experiments. Language,
34, 1958, 370-380.

11. Malécot, A., Vowel nasality as a distinctive
feature in American English. Language
(in press).

12. Pike, K., Phonemics, a Technique for
Reducing Languages to Writing. Ann
Arbor: Univ. Michigan Press, 1947.

13. Potter, R. K., Kopp, G. A., and Green,
Harriet C., Visible Speech. New York:
Van Nostrand, 1947.

14. Stetson, R. H., Motor Phonetics. Arch.
néerl. Phon. exp., 3, 1928, 1-216.

15. Sturtevant, E. H., An Introduction to
Linguistic Science. New Haven: Yale
Univ. Press, 1947.

16. Swadesh, M., Vowels of Chicago English.
Language, 11, 1935, 148-151.

17. Trager, G., and Bloch, B., The syllabic
phonemes of English. Language, 17, 1941,
223-246.

18. Wise, C. M., Applied Phonetics. Englewood
Cliffs, N. J.: Prentice Hall, 1957.










Bekesy Audiometry in 


Analysis of Auditory Disorders 


JAMES JERGER 


Less than 14 years has elapsed since 
Bekesy’s original description of a self- 
recording audiometer (2). Within this 
period, however, the technique of 
‘Bekesy audiometry’ has rapidly gained 
the stature of a major clinical and 
research tool in audiology. 

Bekesy audiometry refers to a 
method in which the subject traces his 
own auditory threshold by means of 
a suitable self-recording audiometer. 
The threshold tracing signal may be 
either a fixed frequency or a gradually 
changing frequency, and the signal 
may be either continuous or periodi- 
cally interrupted in time, but the 
essence of Bekesy’s method is, first, 
that the signal intensity is always 
changing at a constant rate, and second, 
that the subject determines the direc- 
tion of this change by alternately 
pressing and releasing a key that re- 
verses the direction of a motor-driven 
attenuator. He is instructed to press
this key when he just hears the tone
and to release it when he just-no-longer
hears it. By connecting a pen-writing
system to the attenuator a graphic
representation, or tracing, of the subject’s
successive threshold crossings
may be obtained.

James Jerger (Ph.D., Northwestern University,
1954) is Associate Professor of Audiology,
Northwestern University. A portion
of this article is based on a paper presented
at the 1959 Convention of the American
Speech and Hearing Association, Cleveland.
This research was supported by research
grant B-1310 from the National Institutes of
Health, Public Health Service, and by the
United States Air Force under Contract
AF 41(657)-185, monitored by the School of
Aviation Medicine, USAF, Brooks Air Force
Base, Texas.

Volume 3, No. 3

The Bekesy technique is particu- 
larly useful in psychoacoustics. It lends 
itself admirably, for example, to the 
measurement of temporary threshold 
shift following acoustic stimulation 
and has been so employed by several 
investigators (6, 8, 10, 12, 15, 20, 31, 
32, 33, 34). It finds use in the measure- 
ment of pure-tone masking (3, 5). 

The present paper is concerned, how- 
ever, only with Bekesy audiometry as 
a clinical tool in the evaluation of the 
hearing impaired. In the majority of 
papers concerned with the clinical ap- 
plication of Bekesy audiometry, meas- 
urement and description have been 
confined almost exclusively to the 
width or amplitude of the audiometric 
tracing. This distance or width may 
be expressed either in decibels or in 
number of threshold crossings over a 
given frequency span. In the graphic 
form of the Bekesy audiogram it is 
most easily visualized as the amplitude 
of the oscillating trace. Bekesy (2), in 
his original paper, noted that the 
amplitude became greatly diminished 
in subjects with hearing loss accom- 
panied by loudness recruitment. He 
assumed that the tracing amplitude 
represented the first just-noticeable-difference
(JND) in loudness and concluded
that a reduction in its size was
compatible with the presence of an 
abnormally rapid rate of loudness 
growth with intensity (that is, loud- 
ness recruitment). However, Bekesy’s 
assumption that the amplitude repre- 
sents the first JND has been questioned 
by Hirsh, Palva, and Goodman (9), 
who feel that the amplitude actually 
represents the variability about the 
absolute threshold. 

In any event, subsequent papers on 
Bekesy audiometry have dealt primarily 
with the amplitude aspect of the audiometric
tracing (1, 7, 11, 17, 18, 21, 22,
23, 25, 26, 27, 28, 29, 30, 35, 36). The
major point of view in this respect is
best exemplified by the very thorough
monograph of Lundborg (21). This
investigator obtained Bekesy audiograms
on 50 normals, 25 cases of acoustic
trauma, 26 cases of Meniere’s disease,
and 21 cases of diverse retrocochlear
lesion. He then classified the audiograms
into four types based on the tracing
amplitude. There appeared to be a
rather precise relationship between
type of Bekesy tracing and site of 
lesion. Markedly reduced amplitude 
was characteristically present in cases 
with presumably cochlear lesion 
(acoustic trauma and Meniere’s disease) 
but characteristically absent in cases 
with retrocochlear lesion. 


In recent years increasing attention 
has been given to another aspect of the 
Bekesy tracing, the change in threshold 
intensity over time as the subject traces 
threshold at a fixed frequency (4, 14,
16, 19, 26, 27, 28, 37). Kos (16), Lierle 
and Reger (19), Jerger, Carhart, and 
Lassman (14), and Yantis (37) have 
shown very little change over time in 
presumably cochlear lesion, but marked 


progression toward higher and higher 
threshold intensity over time in retro- 
cochlear lesion. 


The present paper concerns the rela- 
tionship between Bekesy audiometry 
and site of lesion within the auditory 
system. Unfortunately, almost every 
previous writer has confused this issue 
with a quite separate question, the re- 
lationship between the Bekesy tracing 
and the presence or absence of loudness 
recruitment. It must be emphasized, 
therefore, that the present paper is not 
concerned with how Bekesy audiom- 
etry relates to loudness recruitment, 
only with how it relates to site of 
lesion within the auditory system. 


Procedure 


Subjects. This report is based on the 
Bekesy audiograms of 434 subjects 
tested at the Hearing Clinic of the 
Northwestern University Medical 
School over a three-year period. The 
subjects were referred from various 
sources for audiological evaluation. The 
majority were referred by otologists, a 
small number by neurologists and neuro- 
surgeons in the Chicago area. Although 
no formal attempt at random selection 
was made, the series is fairly representa- 
tive of the otologic case load in a large 
hospital environment. In most cases 
Bekesy audiometry was performed as 
part of a larger battery of auditory 
tests typically administered in a three- 
hour test session. Although tracings 
were ordinarily obtained on both ears, 
subsequent analysis is confined to results 
obtained on only one ear of each sub- 
ject. 


Apparatus. All of the tracings on 
which this report is based were obtained
with a single Bekesy audiometer
(Grason-Stadler, Model E-800). The 
rate of attenuation change was always 
2.5 db per second, and the rate of fre- 
quency change was always one octave 
per minute. The instrument offered 
the option of a test signal that was 
either continuous or periodically inter- 
rupted in time. In the latter case, the 
interruption rate was 2.5 ips. 
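
The tracking principle can be illustrated with a deliberately idealized simulation. The attenuation rate of 2.5 db per second is taken from the text; the listener model (a fixed threshold with a symmetric response margin), the starting level, and the time step are assumptions of this sketch and do not describe the audiometer's circuitry or any real subject.

```python
RATE_DB_PER_S = 2.5  # attenuation rate reported for the instrument


def simulate_tracing(threshold_db, half_width_db=5.0, start_db=0.0,
                     duration_s=60.0, dt=0.25):
    """Return the signal levels (db) at which the simulated listener reverses.

    Hypothetical listener: presses the key once the level reaches
    threshold + half_width and releases it once the level has fallen to
    threshold - half_width. Real listeners are far less regular.
    """
    level = start_db
    pressed = False
    reversals = []
    for _ in range(int(round(duration_s / dt))):
        if not pressed and level >= threshold_db + half_width_db:
            pressed = True    # tone just heard: key down, level falls
            reversals.append(level)
        elif pressed and level <= threshold_db - half_width_db:
            pressed = False   # tone just lost: key up, level rises
            reversals.append(level)
        level += (-RATE_DB_PER_S if pressed else RATE_DB_PER_S) * dt
    return reversals


reversals = simulate_tracing(threshold_db=40.0)
print(reversals[:4])
```

The successive reversal levels correspond to the peaks and troughs that the pen would draw; their spacing is the tracing width discussed in the findings.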

The results reported below involve 
two kinds of tracing, subsequently re- 
ferred to as ‘conventional’ and ‘fixed- 
frequency’ tracings. In conventional 
tracings, the frequency of the test signal 
moved gradually upward from 100 to 
10 000 cps. In fixed-frequency tracings 
the frequency was preset and never 
changed as the subject traced his threshold
over a three-minute period.
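
As a worked check on the sweep parameters: at one octave per minute, a conventional tracing from 100 to 10 000 cps spans log2(100), or about 6.6 octaves, and so occupies roughly 6.6 minutes. The helper names below are illustrative only.

```python
import math


def sweep_minutes(f_start=100.0, f_end=10_000.0, octaves_per_min=1.0):
    """Duration, in minutes, of a logarithmic sweep between two frequencies."""
    return math.log2(f_end / f_start) / octaves_per_min


def frequency_at(minutes, f_start=100.0, octaves_per_min=1.0):
    """Test frequency (cps) reached a given number of minutes into the sweep."""
    return f_start * 2.0 ** (octaves_per_min * minutes)


print(round(sweep_minutes(), 2), "minutes for the full sweep")
print(frequency_at(1.0), "cps after one minute")
```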

In either case, a complete test always 
consisted of two separate tracings. In 
one the signal was periodically inter- 
rupted, in the second it was continuous 
in time. Both tracings, interrupted and 
continuous, were always made on the 
same piece of graph paper with two 
different colors of ink. It has been 
found convenient to symbolize these 
two conditions by the letters ‘I’ for 
interrupted and ‘C’ for continuous in 
subsequent portions of this report. 


Method. A relatively rigidly stand- 
ardized procedure of test administra- 
tion was initially designed, but could 
not be followed rigorously in all 
subjects due to the occasional subject 
whose ability to understand speech was 
extremely limited. In any event, the 
following instructions were used whenever
verbal communication was possible:


When I put these earphones on, you are 
going to hear a beeping sound in your 
ear. As long as you don’t do anything the 
sound will keep getting louder. But you 
can make it fade away by holding down 
this switch. When you let up on the switch 
the sound will get louder again. Now, 
here is what I want you to do. Listen 
very carefully, and, as soon as you hear 
the beeping sound, hold this switch down 
until you can’t hear it any more. As 
soon as the beeping sound is gone, let up 
on the switch until it comes back. Then, 
as soon as you hear it again, hold the 
switch down until it goes away again, 
and so forth. The idea is to keep going 
back and forth from where you can just 
hear the beeping sound to where you 
can just not hear it any more. Never 
let the sound get very loud and never 
let it stay away too long. Hold this 
switch down as soon as you hear the 
sound, then let it up as soon as the sound 
is gone. 

Following these instructions a tracing 
was made with the periodically inter- 
rupted (I) test signal. At the termina- 
tion of this tracing the subject was 


reinstructed as follows: 

Now we are going to do the same thing 
again, but this time the sound will be 
steady instead of beeping on and off. 
Your job is still the same. Hold the switch 
down as soon as you hear the steady 
sound, and let it up as soon as the steady 
sound goes away. 


Following these instructions a tracing 
was made with the continuous (C) 
test signal. This test order, interrupted 
first and continuous second, was used 
in all subjects. Instructions were iden- 
tical for either conventional or fixed- 
frequency tracings. When verbal 
communication was not possible, in- 
structions were effected through pan- 
tomime. 


Findings 


An initial attempt was made to analyse
and score these Bekesy audiograms
quantitatively. Various indices, such as
the width of the continuous tracing in
db, the number of threshold crossings
per quarter octave, the difference between
tracing width at high and low
frequencies, the difference between
continuous and interrupted tracing
widths, and the difference between
continuous and interrupted mid-points,
were evaluated, all with exceedingly
discouraging results. It soon became
apparent that the range of individual
variability on any absolute aspect of the
Bekesy audiogram could be quite substantial.
A good example is the width
of the continuous tracing. In most
Meniere’s patients it is, to be sure, quite
small at high frequencies. On the other


hand many young adults with oto- 
sclerosis show tracing widths consider- 
ably narrower than a large number of 
older Meniere’s patients. There were, 
indeed, significant group tendencies in 
this quantitative analysis, but the degree 
of overlap among groups appeared to 
limit severely the use of any quantita- 
tive measure as a reliable means of 
differentiating site of lesion. A similar 
conclusion was reached by Landes (17). 
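
Two of the indices named above, tracing width and the difference between continuous and interrupted mid-points, can be computed from the successive reversal levels of a tracing. The sketch below assumes that each tracing has already been reduced to an alternating list of peak and trough levels in db; the function names and the sample values are hypothetical and are not data from this study.

```python
from statistics import mean


def tracing_width(reversals):
    """Mean excursion, in db, between successive reversal levels."""
    return mean(abs(b - a) for a, b in zip(reversals, reversals[1:]))


def tracing_midpoint(reversals):
    """Level midway between the mean peak and the mean trough.

    Assumes the list alternates strictly between peaks and troughs.
    """
    return (mean(reversals[0::2]) + mean(reversals[1::2])) / 2


def c_minus_i_gap(continuous, interrupted):
    """Positive when the continuous tracing lies at a higher hearing level,
    that is, below the interrupted tracing on the audiogram form."""
    return tracing_midpoint(continuous) - tracing_midpoint(interrupted)


# Invented reversal levels, for illustration only.
interrupted = [45, 35, 45, 35, 45]
continuous = [58, 54, 59, 55, 60]
print(tracing_width(interrupted), tracing_width(continuous))
print(c_minus_i_gap(continuous, interrupted))
```

Indices of this kind are easy to compute, which is perhaps why they were tried first; the overlap among groups reported above is what limited their diagnostic use.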

On the other hand, a qualitative 
judgment of the patterning or relation- 
ship between the interrupted and the 
continuous tracings seemed to have 
important diagnostic value. There appeared
to be a unique relationship between
continuous and interrupted
tracings corresponding to site of lesion 
within the auditory system. 

One may distinguish four basic types 
of relationship, labelled, respectively, 
type I, type II, type III, and type IV. 
They are illustrated in Figures 1 and 
2. Figure 1 shows the four types in the 
case of conventional tracings, Figure 
2 the corresponding types in the case 


of fixed-frequency tracings. Through- 
out these and subsequent figures, green 
denotes the interrupted (I) tracing and 
red denotes the continuous (C) tracing. 


Type I. The type I relationship is 
characterized by an interweaving or 
superposition of continuous and inter- 
rupted tracings, and by a tracing width 
which is constant over frequency and 
averages about 10 db. There is, how- 
ever, considerable variation about this 
mean value. Tracing widths as small as 
3 db and as large as 20 db are not uncommon.

In the case of fixed-frequency trac- 
ings, the type I relationship is reflected 
in two interweaving, horizontal trac- 
ings. 

Type II. Type II tracings differ from
type I in two respects. First, the con- 
tinuous tracing drops below the inter- 
rupted at high frequencies, but never 
to a substantial extent. The gap seldom 
exceeds 20 db and ordinarily does not 
appear at frequencies below 1000 cps. 


Second, the width or amplitude of the 


continuous tracing is often quite small 
(3 to 5 db) in these higher frequencies. 
This narrowing of the width or ampli- 
tude of the continuous tracing is, of 
course, the classical Bekesy sign 
thought by many to indicate the pres- 
ence of loudness recruitment. 

In fixed-frequency tracing the type 
II result is quite clear-cut. The inter- 
rupted tracing is, again, horizontal and 
of normal width, but the continuous 
trace drops from 5 to 20 db below the 
interrupted, within the first minute; 
thereafter, it maintains a fairly stable 
level. There is a reliable difference be- 
tween interrupted and continuous trac- 
ings but the difference is relatively 
small and remains quite constant after 










Figure 1. The four types of conventional Bekesy audiograms
(panels: type I, type II, type III, type IV; hearing level in db,
0 to 100, against frequency in cps, 125 to 8K). Green represents threshold
tracing for a periodically interrupted tone, red for a continuous tone.


the first 60 seconds of tracing. Further- 
more, the difference appears only at 
mid- and high frequencies (that is, 
above 500 to 1000 cps). 


Type III. Type III tracings are quite
dramatic. The continuous tracing drops 
below the interrupted to a remarkable 
degree. Furthermore, the two curves 
may diverge at relatively low frequen- 
cies (100 to 500 cps). It is not uncom- 
mon to observe the continuous tracing 
break away at a frequency as low as 


150 cps and drop to a level as much 
as 40 to 50 db below the interrupted 
tracing. The width of the continuous 
tracing ordinarily remains, however, 
quite normal. 

In type III fixed-frequency tracings
the interrupted tracing is horizontal but
the continuous drops very rapidly
and ordinarily does not stabilize at all.
Typically, the continuous tracing begins
at the same level as the interrupted
but describes a rapidly descending trace
to the limit of the equipment. A 40-to-50-db
drop within as little as 60 seconds
is not unusual.

Figure 2. The four types of fixed-frequency
Bekesy tracings (panels: type I, type II, type III, type IV;
hearing level in db against time in minutes, with panel
markers at 250, 1K, and 4K cps). Green is interrupted; red,
continuous.

Type IV. Type IV tracings more
closely resemble type II than type III
but differ in one important respect.
The continuous tracing falls consistently
below the interrupted at frequencies
below 500 cps. At higher
frequencies the continuous may fall a
constant distance below the interrupted,
resembling a type II in this respect.
The tracing width may or may not
become abnormally small, further adding
to possible confusion with type II.
At mid- and high frequencies there
may even be some overlap between C
and I. The distinguishing feature, however,
occurring in both conventional
and fixed-frequency tracings, is the
gap between C and I at relatively low
frequencies (100 to 500 cps). Type IV
tracings differ from type III tracings
in that C ordinarily does not show a
precipitous drop over time.
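
The verbal definitions of the four types can be paraphrased as a small decision rule. This is a sketch only: the classification reported here rested on qualitative judgment of the tracings, and the three inputs, the 5-db cutoff, and the order of the tests are assumptions introduced for illustration, not a procedure taken from the study.

```python
def classify_tracing(low_gap_db, high_gap_db, c_stabilizes, min_gap_db=5.0):
    """Assign a type label from three summary features of a C/I pair.

    low_gap_db   -- distance of C below I at 100 to 500 cps (hypothetical input)
    high_gap_db  -- distance of C below I above about 1000 cps (hypothetical input)
    c_stabilizes -- False when the continuous tracing keeps falling over time
    min_gap_db   -- assumed cutoff separating a gap from interweaving
    """
    if not c_stabilizes:
        return "III"  # continuous tracing falls away without levelling off
    if low_gap_db >= min_gap_db:
        return "IV"   # stable gap already present at low frequencies
    if high_gap_db >= min_gap_db:
        return "II"   # small stable gap confined to mid and high frequencies
    return "I"        # continuous and interrupted tracings interweave


print(classify_tracing(0.0, 0.0, True))
print(classify_tracing(1.0, 12.0, True))
print(classify_tracing(35.0, 60.0, False))
print(classify_tracing(20.0, 15.0, True))
```

A rule of this kind makes no provision for tracings that fit none of the four patterns; it would have to return a separate label for them.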

The vast majority of Bekesy tracings
can be fitted into one of these four
categories quite reliably. There are,
however, a small number that, for one
reason or another, do not appear to fit
any of the four classic patterns. They
may be designated by the label ‘questionable.’
In some of these, excessive
tracing width (30 to 40 db) obscures
the relationship between C and I. In
others the conventional and fixed-frequency
results are contradictory,
and, in still others, high-pitched tinnitus
appears to invalidate the continuous
tracing. There is no unique commonality
to these questionable tracings.
They seem, instead, to reflect a general
lack of validity. This category encompasses
only a relatively small percentage
of tracings and does not seem
to be unique to any particular etiology
or site of lesion.

Figure 3. Conventional and fixed-frequency
Bekesy tracings in a 31-year-old female with
left otosclerosis: A, conventional tracings;
B, fixed-frequency tracings. Loudness recruitment,
as measured by the alternate
binaural loudness balance test, was absent at
250, 1000, and 4000 cps on the test ear. The
PB score at SL = 25 db was 100%. Bekesy
tracings are type I.

Figure 4. Conventional and fixed-frequency
Bekesy tracings in a 42-year-old male with left
Meniere’s disease: A, conventional tracings; B,
fixed-frequency tracings. Loudness recruitment,
as measured by the alternate binaural
loudness balance test, was present at 1000 and
4000 cps but absent at 250 cps. The PB score
at SL = 25 db was 24%. Bekesy tracings
are type II.

Figure 5. Preoperative conventional and
fixed-frequency Bekesy tracings in a 47-year-old
female with a surgically confirmed right
acoustic neurinoma: A, conventional tracings;
B, fixed-frequency tracings. Loudness recruitment,
as measured by the alternate binaural
loudness balance test, was absent at 4000 cps.
The PB score at SL = 25 db was 26%.
Bekesy tracings are type III.

Figure 6. Preoperative conventional and
fixed-frequency Bekesy tracings in a 51-year-old
female with a surgically confirmed left
acoustic neurinoma: A, conventional tracings;
B, fixed-frequency tracings. Loudness recruitment,
as measured by the alternate binaural
loudness balance test, was absent at 250, 1000,
and 4000 cps. The PB score at SL = 25 db
was 58%. Bekesy tracings are type IV.







Illustrative Cases. Figures 3, 4, 5, and 
6 illustrate these four basic types of 
tracings as they occur in actual sub- 
jects. Figure 3 shows Bekesy tracings 
in a case of unilateral otosclerosis. Con- 
ventional tracings are type I. The con- 
tinuous tracing and the interrupted 
tracing overlap, and the width or ampli- 
tude of the continuous tracing remains 
essentially normal (that is, about 10 
db). Fixed-frequency tracings are also 
type I. They show essentially horizontal 
tracings, with the continuous and inter- 
rupted interweaving at all test fre- 
quencies. 


Figure 4 shows test results in a case 
of unilateral Meniere’s disease. Here, 
the Bekesy tracings are clearly type II. 
On the conventional tracing C breaks 
away from I at about 500 cps and re- 
mains 10 to 15 db below I out to 8000 
cps. Throughout this range the width 
of the C tracing is quite small. The 
fixed-frequency tracing at 4000 cps 
shows the characteristic initial drop of 
10 to 20 db in the C trace, followed by 
a relatively stable level. In this particu- 
lar case, the width of the C tracing is 
relatively small, but this is not invari- 
ably the case in type II fixed-frequency 
tracings. 

Figure 5 shows Bekesy tracings in 
a subject with a right acoustic neuri- 
noma. Here, one sees a relatively ex- 
treme example of the type III Bekesy 
tracing. On the conventional tracings 
C never does overlap I. Even at 125 cps 
it runs about 35 db below I, and the 
disparity increases with frequency. At 
1000 cps the C trace has dropped to 
the limit of the equipment, whereas 
the I trace is at a hearing level of about 
10 db. 


The fixed-frequency tracing at 250
cps is quite dramatic. On the left, or
unaffected, ear, C and I are horizontal
and overlap. On the right, or affected,
ear, however, I is stable but C drops
over 60 db to the limit of the equipment
in less than 60 seconds.

The fact that such a phenomenal 
drop should occur at all is remarkable. 
That it should occur for a frequency 
as low as 250 cps is even more remark- 
able. Exploration at lower frequencies 
in this subject revealed the same steadily 
progressive drop in the C trace at 100 
cps, the lowest frequency obtainable 
from the equipment. 

Figure 6 illustrates the type IV trac- 
ing in another surgically-confirmed 
acoustic neurinoma. Neither conven- 
tional nor fixed-frequency tracings 
show the steady decline typical of a 
type III tracing. At the same time the 
relatively large gap between C and I 
at very low frequencies clearly differ- 
entiates this from a type II tracing pat- 
tern. In this particular case the C 
tracing width is relatively small, but 
other type IV tracings show a quite 
normal width. 


Distribution of Patterns. In order to 
study the generality of this apparent 
relationship between type of Bekesy 
tracing and site of lesion within the 
auditory mechanism, all Bekesy audio- 
grams obtained on subjects with hear- 
ing loss in the Northwestern University 
Hearing Clinics were categorized ac- 
cording to type. 

Table 1 shows the number of subjects 
within each of the four categories for 
various etiological subgroups. In the 
case of the acoustic neurinoma group, 
classification is based on surgical con- 
firmation. All other classification by 
subgroup is based on the medical 







Table 1. Distribution of the four Bekesy types (I, II, III, IV) and of unclassifiable tracings (?) according to presumed etiology of the hearing loss in 434 subjects.

                                              Tracings
Etiology                                I     II    III    IV     ?    Total

Normal Hearing                         33      0      0     0     0      33
Otosclerosis                           50      2      0     0     2      54
Otitis Media                            6      0      0     0     0       6
Other Conductive Loss                   9      0      0     0     0       9
Meniere's Disease                       4     26      0     1     1      32
Noise Induced Loss                      7     15      0     0     0      22
Acoustic Neurinoma                      0      0      6     4     0      10
Unknown Sensorineural Loss             54    119      0    12    10     195
Presbycusis                            24     15      0     2     3      44
Otosclerosis Plus Sensorineural Loss    2     10      0     1     0      13
Sudden Onset of Loss                    1      1     10     4     0      16
Total                                 190    188     16    24    16     434








diagnosis supplied by staff members of 
the Department of Otolaryngology of 
the Northwestern University Medical 
School. 

Included in this series of 434 ears are 
69 presumably conductive lesions pri-
marily due to otosclerosis and otitis
media, 54 presumably cochlear lesions
due to Meniere’s disease and prolonged 
noise exposure, 10 known eighth nerve 
lesions due to acoustic neurinoma, and 
four subgroups in which the site of 
lesion is less well understood. One 
of these, the sensorineural unknown 
group, constitutes the largest single 
subgroup with 195 subjects. An ad- 
ditional 16 subjects from this group 
are treated separately because of a 
history of relatively sudden onset of 
loss in one ear, without subsequent 
fluctuation. Finally, there are 44 sub- 
jects with presbycusis, and 13 subjects 
with advanced otosclerosis accompanied 
by secondary sensorineural loss. 

Examination of subgroups in Table 
1, for which there is relatively good 
agreement concerning the locus of 


pathology, suggests a fairly strong re- 
lationship between type of Bekesy 
tracing and site of lesion. In lesions of 
the middle ear (otosclerosis, otitis 
media) the type I tracing predominates. 
In cochlear lesion (Meniere’s, noise- 
induced) the type II tracing predom- 
inates although some fall into the type 
I category. No Meniere’s case ever 
showed a type III tracing. In eighth 
nerve lesion (acoustic neurinoma) type 
III and type IV tracings predominate. 
No acoustic neurinoma ever gave a 
type II tracing. 
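
The relationship described above can be checked by pooling the Table 1 counts by presumed site of lesion. The following is a minimal sketch, not part of the original analysis: the counts are transcribed from Table 1, while the grouping dictionary, the function names, and the decision to count "Other Conductive Loss" with the middle ear group (which reproduces the 69 conductive lesions mentioned in the text) are assumptions made for illustration.

```python
# Counts transcribed from Table 1: tracings of type I, II, III, IV, and unclassifiable (?).
TABLE_1 = {
    "Normal Hearing": (33, 0, 0, 0, 0),
    "Otosclerosis": (50, 2, 0, 0, 2),
    "Otitis Media": (6, 0, 0, 0, 0),
    "Other Conductive Loss": (9, 0, 0, 0, 0),
    "Meniere's Disease": (4, 26, 0, 1, 1),
    "Noise Induced Loss": (7, 15, 0, 0, 0),
    "Acoustic Neurinoma": (0, 0, 6, 4, 0),
    "Unknown Sensorineural Loss": (54, 119, 0, 12, 10),
    "Presbycusis": (24, 15, 0, 2, 3),
    "Otosclerosis Plus Sensorineural Loss": (2, 10, 0, 1, 0),
    "Sudden Onset of Loss": (1, 1, 10, 4, 0),
}

TYPES = ("I", "II", "III", "IV", "?")

# Hypothetical grouping by presumed site of lesion (an assumption of this sketch).
SITE_GROUPS = {
    "middle ear": ["Otosclerosis", "Otitis Media", "Other Conductive Loss"],
    "cochlea": ["Meniere's Disease", "Noise Induced Loss"],
    "eighth nerve": ["Acoustic Neurinoma"],
}


def pooled_counts(etiologies):
    """Sum the five tracing-type counts over the listed etiologies."""
    rows = [TABLE_1[name] for name in etiologies]
    return tuple(sum(column) for column in zip(*rows))


def summarize():
    """Return, for each site group, the percentage of tracings of each type."""
    summary = {}
    for site, etiologies in SITE_GROUPS.items():
        counts = pooled_counts(etiologies)
        total = sum(counts)
        summary[site] = {t: round(100 * c / total, 1) for t, c in zip(TYPES, counts)}
    return summary


if __name__ == "__main__":
    for site, percentages in summarize().items():
        print(site, percentages)
```

Pooling in this way keeps the comparison at the level of site of lesion, which is the level at which the text draws its conclusion, rather than at the level of individual etiologies.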

In view of this compelling relation- 
ship, the results of the analysis in etio- 
logic subgroups of more obscure origin 
are of interest. As might be expected, 
the majority of sensorineural unknowns 
are type II, suggesting cochlear lesion. 
This is also true of otosclerosis ac- 
companied by secondary sensorineural 
loss. 

Almost one-third of the sensorineural 
unknowns, however, show type I trac- 
ings and 12 show type IV tracings. 
This relatively ill-defined group may 




possibly include at least two and possibly three distinctly different
kinds of sensorineural loss. In presbycusis the situation is even more
provocative. Here, there are actually more type I than type II tracings.

Contrary to expectation, hearing loss of sudden onset is primarily
type III and type IV, suggesting eighth nerve rather than, or perhaps
in addition to, cochlear lesion.


Discussion


In certain respects the present results do not seem to be in very good
agreement with the findings of some previous investigators. Lundborg
(21), for example, apparently observed nothing like the present type
III tracings in any of his 21 cases of retrocochlear lesion. His Bekesy
thresholds were apparently in good agreement with the results of
conventional threshold audiometry. Nor do Palva's (27) results on 39
cases agree with the present findings in fixed-frequency tracings.
After four minutes of threshold tracing, there was a change of more
than 10 db in only one of Palva's 33 perceptive losses. He concluded
(26) that 'an abnormal loss in sensitivity is not common enough to
give reliable clues to differential diagnosis.'

The present findings are far more encouraging. They show clear
evidence of pathological adaptation (types II, III, and IV) in 226 of
332 sensorineural losses (68%). Furthermore, the manner in which
pathological adaptation appears to be related to site of lesion
suggests that Bekesy audiometry has the potential to become an
exceedingly sharp tool in the differential diagnosis of hearing
disorders.

Finally, it should be observed that the present results are in accord
with the previous findings of Dix and Hood (4), Kos (16), Lierle and
Reger (19), and Yantis (37).

It may be appropriate to cite two 
possible bases for the lack of agreement 
between the present results and the 
previous findings of Lundborg and of 
Palva. First, the discrepancy may be 
due to a simple artifact of instrumenta- 
tion. Lundborg (21) states that, in his 
Bekesy audiometer, attenuation changed 
in 2-db steps, and Palva (24) states that 
his audiometer changed in 1-db steps. 
It may be that the momentary transient 
energy introduced by each abrupt 
change in level made their continuous 
stimuli more like the interrupted than 
the continuous stimulus used in the 
present study. In the Bekesy audiom- 
eter used in the present experiment, 
successive changes in level were less 
than 0.25 db. This distinction between 
virtually continuous change and change 
in small, discrete steps may very well 
be an exceedingly important variable. 
Jerger and Bucy (13) showed, for ex- 
ample, that only very brief silent inter- 
vals (10 to 20 msec) between successive 
short tones were sufficient to maintain 
a stable horizontal tracing in a patient 
who readily demonstrated a type III 
tracing under continuous stimulation. 

Second, it should be observed that
with the exception of Dix and Hood 
(4), who used different instrumenta- 
tion, no previous investigator, to the 
author’s knowledge, has compared the 
continuous threshold tracing with the 
corresponding interrupted threshold 
tracing. Apparently, all previous work- 
ers have employed only a continuous 
stimulus for either conventional or 










fixed-frequency tracing. The present 
results, however, suggest that the com- 
parison between C and I is the key to 
fruitful interpretation of Bekesy trac- 
ings. Type III continuous tracings are, 
to be sure, so dramatic that they are 
easily recognized, but they are com- 
paratively rare. In the vast majority of 
cases showing pathological adaptation 
(type II) the magnitude of the effect 
is not great (5 to 20 db). It occurs, 
furthermore, so rapidly that, if one 
makes measurements only at one minute 
intervals and seeks only a shift in the 
continuous threshold, he is likely to 
observe very little evidence of dra- 
matic adaptation over time. When the 
continuous tracing is compared with 
its interrupted counterpart, however, 
the abnormality is readily recognized. 

Another aspect of interpretation that 
deserves re-emphasis is the relationship 
between pathological adaptation and 
frequency. Again, when marked adaptation
occurs (type III), it may
generally be observed at almost any
frequency with measurable hearing.


But such tracings are, again, compar- 
atively rare. In the more commonly 
encountered type II tracing, pathologi- 
cal adaptation is very definitely a high- 
frequency phenomenon. The manner in 
which the difference between C and I 
relates to frequency is, in itself, a quite 
stable characteristic of the over-all type 
II pattern. 


Summary 

A qualitative analysis of 434 Bekesy 
audiograms suggests that most tracings 
can be placed into one of four cate- 
gories. The basis for categorization is 
the relationship between tracings of 
periodically interrupted and continu- 
ous tonal stimuli. Lesions of the middle 


ear are characterized by one relation- 
ship, lesions of the cochlea by a second, 
and lesions of the eighth nerve by a 
third and fourth. 


Summario in Interlingua 

Un analyse qualitative de quatro 
centos trenta-quatro audiogrammas de 
Bekesy suggere que le major parte del 
audiogrammas pote esser placiate in 
un de quatro categorias. Le base del 
categorisation es le relation inter audio- 
grammas de stimulos tonal que es inter- 
rupte periodicamente e stimulos tonal 
que es continue. Lesiones del aure medie 
demonstra un relation, lesiones del 
coclea demonstra un secunde, e lesiones 
del nervo octave un tertie e quarte. 


Editor’s note: For the interest of Journal 
readers, the author has prepared the above 
Summary in Interlingua, an international aux- 
iliary language developed by the International 
Auxiliary Language Association, 420 Lexing- 
ton Ave., New York 17. As of 1960, 17 Amer- 
ican and five foreign journals are printing 
summaries in Interlingua; two American 
journals are being edited completely in Inter- 
lingua; seven international congresses thus far 
have furnished summaries of all papers in 
Interlingua. The core of this language is the 
vast number of internationally identical tech- 
nical terms already in existence in the various 
national tongues of western culture. A recent 
UNESCO survey indicated that of all existing 
languages, Interlingua has the widest range 
of immediate intelligibility. English is second. 
The Journal will print other summaries in 
Interlingua when these are provided by the 
authors. 


References 


1. Bangs, J. L., and Mullins, C. J., Recruitment testing in hearing
   and its implications. Arch. Otolaryng., 58, 1953, 582-592.

2. Bekesy, G. v., A new audiometer. Acta Otolaryng., 35, 1947,
   411-422.

3. Bilger, R. C., and Hirsh, I. J., Masking of tones by bands of
   noise. J. acoust. Soc. Amer., 28, 1956, 623-630.

4. Dix, M. R., and Hood, J. D., Modern developments in pure tone
   audiometry and their application to the clinical diagnosis of
   end-organ deafness. Proc. R. Soc. Med., 46, 1953, 992-994.

5. Ehmer, R. H., Masking patterns of tones. J. acoust. Soc. Amer.,
   31, 1959, 1115-1120.

6. Epstein, A., and Schubert, E. D., Reversible auditory fatigue
   resulting from exposure to a pure tone. Arch. Otolaryng., 65,
   1957, 174-182.

7. Hedgecock, L., The measurement of auditory recruitment. Arch.
   Otolaryng., 62, 1955, 515-527.

8. Hirsh, I. J., and Bilger, R. C., Auditory-threshold recovery after
   exposures to pure tones. J. acoust. Soc. Amer., 27, 1955,
   1186-1194.

9. Hirsh, I. J., Palva, T., and Goodman, A., Difference limen and
   recruitment. Arch. Otolaryng., 60, 1954, 525-540.

10. Hirsh, I. J., and Ward, W. D., Recovery of the auditory threshold
    after strong acoustic stimulation. J. acoust. Soc. Amer., 24,
    1952, 131-141.

11. Hormi, A. L., Difference limen of intensity in hearing impairment
    due to craniocerebral injury. Laryngoscope, 68, 1958, 808-813.

12. Hughes, J. R., Auditory sensitization. J. acoust. Soc. Amer., 26,
    1954, 1064-1070.

13. Jerger, J., and Bucy, P., Audiologic findings in an unusual case
    of eighth nerve lesion. J. Aud. Res., (in press).

14. Jerger, J., Carhart, R., and Lassman, Joyce, Clinical
    observations on excessive threshold adaptation. Arch. Otolaryng.,
    68, 1958, 617-623.

15. Kopra, L. L., Threshold recoveries for continuous and interrupted
    pure tones following auditory fatigue. J. acoust. Soc. Amer., 27,
    1955, 201.

16. Kos, C. M., Auditory function as related to the complaint of
    dizziness. Laryngoscope, 65, 1955, 711-721.

17. Landes, B. A., Recruitment measured by automatic audiometry.
    Arch. Otolaryng., 68, 1958, 685-696.

18. Lidén, G., Loss of hearing following treatment with
    dihydrostreptomycin or streptomycin. Acta Otolaryng., 43, 1953,
    551-572.

19. Lierle, D. M., and Reger, S. N., Experimentally induced temporary
    threshold shifts in ears with impaired hearing. Ann. Oto. Rhino.
    Laryng., 64, 1955, 263-277.

20. Lierle, D. M., and Reger, S. N., Further studies of threshold
    shifts as measured with the Békésy-type audiometer. Ann. Oto.
    Rhino. Laryng., 63, 1954, 772-784.

21. Lundborg, T., Diagnostic problems concerning acoustic tumors. A
    study of 300 verified cases and the Békésy audiogram in the
    differential diagnosis. Acta Otolaryng., Suppl. 99, 1952.

22. Miskolczy-Fodor, F., The Békésy difference limen in bone
    conduction and recruitment. (in German) Pract. Oto. Rhino.
    Laryng., 19, 1957, 282-288.

23. Moller, F., and Nenzelius, C., An accumulation of cases of
    neurogenous hearing impairment. Acta Otolaryng., 47, 1957,
    158-166.

24. Palva, T., Absolute thresholds for continuous and interrupted
    pure tones. Acta Otolaryng., 46, 1956, 129-136.

25. Palva, T., Cochlear vs. retrocochlear lesions. Laryngoscope, 68,
    1958, 288-299.

26. Palva, T., Recruitment testing. Arch. Otolaryng., 66, 1957,
    93-98.

27. Palva, T., Recruitment tests at low sensation levels.
    Laryngoscope, 66, 1956, 1519-1540.

28. Palva, T., Self-recording threshold audiometry and recruitment.
    Arch. Otolaryng., 65, 1957, 591-602.

29. Ranta, L. J., Acoustic and vestibular disturbances following
    streptomycin-treated tuberculous meningitis in children. Acta
    Otolaryng., Suppl. 136, 1958.

30. Reger, S. N., A clinical and research version of the Bekesy
    audiometer. Laryngoscope, 62, 1952, 1333-1351.

31. Reger, S. N., and Lierle, D. M., Changes in auditory acuity
    produced by low and medium intensity level exposures. Trans.
    Amer. Acad. Ophthal. Oto-laryng., 58, 1954, 433-438.

32. Rüedi, L., Actions of vitamin A on the human and animal ear. Acta
    Otolaryng., 44, 1954, 502-516.

33. Schulthess, G. v., Evaluation of hearing impairment due to
    industrial noise. Arch. Otolaryng., 65, 1957, 512-520.

34. Trittipoe, W. J., Temporary threshold shift as a function of
    noise exposure level. J. acoust. Soc. Amer., 30, 1958, 250-253.

35. Wedenberg, E., Auditory tests on newborn infants. Acta
    Otolaryng., 46, 1956, 446-461.

36. Wedenberg, E., Hereditary background of auditory impairment;
    laboratory detection of heterozygotes of deafness; a
    Bekesy-audiometric examination of parents with children deaf from
    birth. Acta Otolaryng., 49, 1958, 451-452.

37. Yantis, P. A., Clinical applications of the temporary threshold
    shift. Arch. Otolaryng., 70, 1959, 779-787.











Electrophysiologic Responsiveness 


and Alpha Rhythm in Children 


SIDNEY SCHOENFELD 


ROBERT GOLDSTEIN 


Charan and Goldstein (1), and later 
Rosenblüt, Bilger, and Goldstein (3)
showed a relation in adults among sex, 
alpha rhythm in the EEG, and electro- 
physiologic responses to sound. Briefly, 
men with a prominent alpha rhythm 
give significantly fewer electrodermal 
responses to sound than do men with 
little or no measurable alpha rhythm. 
This finding is not applicable to women, 
and women as a group give fewer 
electrodermal responses than men. 
These findings account in part for some 
of the variabilities in responses noted 
during electrodermal audiometry on 
adults. Variability in electrodermal re- 
sponsiveness is at least as great among 
children, with and without auditory 
disorders. 

The present study was undertaken 
to determine if some of the variability 
in electrodermal responsiveness in chil- 
dren can be related to differences in 
sex and in alpha prominence. A second 





Sidney Schoenfeld (M.S., Washington Uni- 
versity, 1958) is Clinical Audiologist, Cen- 
tral Institute for the Deaf. Robert Gold- 
stein (Ph.D., Washington University, 1952) 
is Director, Audiology Section, Department 
of Otolaryngology, Jewish Hospital of St. 
Louis. This study was supported by Grant 
B-240 from the National Institutes of Health 
to Central Institute for the Deaf. An adapta- 
tion of this paper was presented at the 1958 
American Speech and Hearing Association 
Convention, New York. 




purpose was to determine if differences 
in conditionability as well as in re- 
sponsiveness could be related to these 
same factors. 


Procedure 

Subjects. The subjects were 12 boys 
and 12 girls with normal hearing. The 
boys ranged in age from 66 to 127 
months, the girls from 61 to 131 
months. 


Apparatus and General Procedures.
The apparatus for the study was the
same as that used by Rosenblüt, Bilger,
and Goldstein (3). The only modification
of the previously described apparatus
(1, 2, 3) was the use of potential
rather than resistance changes in
detecting electrodermal responses.

The general procedures for electrode 
placement, EEG control recordings, 
and determination of strength of the 
unconditioned stimulus (shock) were 
essentially the same as in the previous
studies (1, 3).


Experimental Session. The condi- 
tioned stimulus was a 1000-cps tone at 
40 db above the predetermined hearing 
level, delivered to the right ear through 
an earphone mounted in a headband. 
An inactive phone on the same head- 
band covered the left ear. Distributed 
as follows through the experimental 
session were 21 tonal stimuli, five electric
shocks, and four silent control
intervals: (a) an adaptation series of
eight tones and two control intervals, 
(b) a conditioning series of five tones, 
each accompanied for 0.5 sec by a 
shock 4.5 sec after the onset of the 
tone, and (c) an extinction series of 
eight unreinforced tones and two con- 
trol intervals. Three different schedules 
of randomly selected order of tones 
and controls were used. During the test 
only the number from the schedule was 
written alongside the stimulus mark. 
The use of controls and the method of 
designating stimuli by number has been 
described previously (3). The condi- 
tioning series was limited to five trials 
because of the desire to subject the 
children to a minimum of discomfort. 
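
The structure of the session described above can be laid out programmatically. The sketch below is illustrative only: the series sizes and shock timing come from the text, but the function names, the event labels, and the use of a seeded shuffle to stand in for the "randomly selected order" are assumptions, since the original schedules are not reproduced in the paper.

```python
import random

# Timing stated in the text: the shock begins 4.5 sec after tone onset and lasts 0.5 sec.
SHOCK_ONSET_SEC = 4.5
SHOCK_DURATION_SEC = 0.5


def shock_window():
    """Return (start, end) of the shock in seconds relative to tone onset."""
    return (SHOCK_ONSET_SEC, SHOCK_ONSET_SEC + SHOCK_DURATION_SEC)


def build_session(seed=None):
    """Build one hypothetical schedule as a list of (number, kind) events."""
    rng = random.Random(seed)

    def unreinforced_series():
        # Eight tones and two silent control intervals in shuffled order
        # (the shuffling procedure is an assumption of this sketch).
        events = ["tone"] * 8 + ["control"] * 2
        rng.shuffle(events)
        return events

    adaptation = unreinforced_series()
    conditioning = ["tone+shock"] * 5  # five reinforced tones
    extinction = unreinforced_series()
    kinds = adaptation + conditioning + extinction
    # Events are numbered so that only the number need be written on the record.
    return list(enumerate(kinds, start=1))


if __name__ == "__main__":
    for number, kind in build_session(seed=1):
        print(number, kind)
```

Calling `build_session` with three different seeds would give three different schedules, in the spirit of the three schedules mentioned in the text.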


Analysis of Records 


Analysis of Responses. EDR and EER 
were judged according to the criteria 
described by Rosenblüt, Bilger, and
Goldstein (3). A simple ‘yes’ or ‘no’ 
decision was made of electric changes 
associated with each tone (or control). 
No further use was made of the con- 
trols except as a base to determine that 
responses to tones from all subjects 
combined were significantly greater 
than chance expectation. 


Classification of EEG Pattern. Wave 
forms in the EEG with a frequency of 
8 to 12 per sec were considered as alpha 
rhythm. This alpha rhythm was studied 
in three 10-sec samples from identical 
portions of each record: prior to, dur- 
ing, and following the experimental 
session. The arbitrary measure of alpha 
prominence was the percentage of time 
that alpha rhythm of at least 30 µV was
present in the samples. Fortuitously, the 
12 subjects with the most and the least 




prominent alpha rhythm were equally 
divided between boys and girls. In ad- 
dition, each successive group of six in 
order of alpha prominence contained 
three boys and three girls. In the final 
statistical analysis only two groups 
were compared: the six boys and six 
girls with the most prominent alpha 
rhythm with the six boys and six girls 
with the least prominent alpha. 
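
As a rough illustration of the alpha-prominence measure and the grouping just described, the sketch below computes the percentage of time that qualifying alpha rhythm was present across the three 10-sec samples and then splits subjects into the more and less prominent halves. The input format (lists of start and end times of qualifying alpha within each sample) and both function names are assumptions; the paper does not say how the time was tallied.

```python
SAMPLE_SEC = 10.0  # length of each EEG sample, as stated in the text


def alpha_prominence(samples):
    """Percentage of sampled time with alpha rhythm of at least 30 microvolts.

    `samples` holds one list per 10-sec sample; each list contains
    (start, end) intervals in seconds during which qualifying alpha
    was present (a hypothetical input format).
    """
    alpha_time = sum(end - start for sample in samples for start, end in sample)
    return 100.0 * alpha_time / (SAMPLE_SEC * len(samples))


def split_by_prominence(subjects):
    """Split (subject_id, sex, prominence) records into most and least prominent halves."""
    ranked = sorted(subjects, key=lambda record: record[2], reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


if __name__ == "__main__":
    # Alpha present for 5 sec before, 2 sec during, and 0 sec after the session.
    print(alpha_prominence([[(0.0, 5.0)], [(2.0, 4.0)], []]))
```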


Results 

Statistical Analysis. The measure 
studied in the following analysis is the 
proportion of responses to each of the 
eight tones in the adaptation and in the 
extinction series. The proportions were
all transformed to $2 \arcsin p^{1/2}$ according
to the technic recommended by
Walker and Lev (4, p. 423). The rationale
and justification for this procedure
are given in an earlier paper (3).
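
The transformation named above, twice the arcsine of the square root of a proportion, is commonly used to stabilize the variance of proportions before analysis of variance. A minimal sketch follows; the function name and the example proportions are illustrative and are not taken from the study's data.

```python
import math


def arcsine_transform(p):
    """Return 2 * arcsin(sqrt(p)) in radians for a proportion p between 0 and 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


if __name__ == "__main__":
    # Hypothetical proportions of responses to a given tone.
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, round(arcsine_transform(p), 3))
```

The transformed values run from 0 (no responses) to pi (a response from every subject), and equal steps in the transformed scale correspond to larger steps in proportion near 0.5 than near 0 or 1.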

The statistical design used to evaluate 
the data is a four-way classification 


Table 1. Results of analysis of variance across all subjects.

Source of Variation                    df      ms       F

Sex (S)                                 1    0.019
Alpha (A)                               1    0.428
S x A                                   1    1.385    1.66
Pooled Error (PE)                      20    0.833
Pre- and Post-Conditioning (P)          1    1.323    3.80
P x S                                   1    0.287
P x A                                   1    0.535    1.54
P x S x A                               1    0.005
P x PE                                 20    0.348
Measures (EDR and EER) (M)              1    3.185    8.29*
M x S                                   1    0.084
M x A                                   1    0.073
M x S x A                               1    0.385    1.00
M x PE                                 20    0.384
M x P                                   1    0.005
M x P x S                               1    0.012
M x P x A                               1    0.012
M x P x S x A                           1    0.002
M x P x PE                             20    0.069








* Significant at or beyond the 1% level. 
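
The F ratios in Table 1 can be recomputed from the mean squares. The sketch below assumes that each effect is tested against the error term listed at the end of its block in the table (S, A, and S x A against PE; effects involving P alone against P x PE; and so on); this pairing is inferred from the table layout rather than stated by the authors, and the dictionary and function names are illustrative. Under that assumption, the ratio of 1.66 corresponds to S x A tested against the pooled error.

```python
# Mean squares transcribed from Table 1.
MEAN_SQUARES = {
    "S": 0.019, "A": 0.428, "S x A": 1.385, "PE": 0.833,
    "P": 1.323, "P x S": 0.287, "P x A": 0.535, "P x S x A": 0.005, "P x PE": 0.348,
    "M": 3.185, "M x S": 0.084, "M x A": 0.073, "M x S x A": 0.385, "M x PE": 0.384,
    "M x P": 0.005, "M x P x S": 0.012, "M x P x A": 0.012,
    "M x P x S x A": 0.002, "M x P x PE": 0.069,
}

# Assumed error term for each effect (inferred from the grouping of rows in the table).
ERROR_TERM = {
    "S": "PE", "A": "PE", "S x A": "PE",
    "P": "P x PE", "P x S": "P x PE", "P x A": "P x PE", "P x S x A": "P x PE",
    "M": "M x PE", "M x S": "M x PE", "M x A": "M x PE", "M x S x A": "M x PE",
    "M x P": "M x P x PE", "M x P x S": "M x P x PE",
    "M x P x A": "M x P x PE", "M x P x S x A": "M x P x PE",
}


def f_ratio(effect):
    """Return the F ratio of an effect against its assumed error term."""
    return MEAN_SQUARES[effect] / MEAN_SQUARES[ERROR_TERM[effect]]


if __name__ == "__main__":
    for effect in ERROR_TERM:
        print(f"{effect:15s} F = {f_ratio(effect):.2f}")
```

With 1 and 20 degrees of freedom, standard tables give critical values of roughly 4.35 at the 5% level and 8.10 at the 1% level, which agrees with only the measures effect being marked as significant.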










analysis of variance. For the measures 
effect (EDR and EER), measurements 
were repeated on all subjects. The pre- 
and post- conditioning effect was de- 
termined by taking the proportion of 
responses during the adaptation and the 
extinction series. Sex and EEG patterns 
were independent variables. 

Positive Findings. Table 1 gives the 
results of the statistical analysis. Only 
the ‘measures’ effect is significant, and 
this is attributed to the greater number 
of electroencephalic (EER) than elec- 
trodermal (EDR) responses. 


Discussion 

The one significant effect (measures), 
that is, the greater number of EER than 
EDR, does not relate to differences be- 
tween sex, or between EEG patterns. 
There was no significant ‘measures’ ef- 
fect or interaction in a previous study 
on normal adults (3). 

The failure to achieve a significant 
pre- and post- conditioning interaction 
means that the brief conditioning ses- 
sion (five consecutive reinforcements) 
was not sufficient to produce signifi- 
cantly more responses during extinction 
than during adaptation. Thus, the sec- 
ond of the two questions raised in the 
introduction, that is, is conditionability 
in children related to either sex or alpha 
pattern, remains unanswered. 

With respect to the primary question 
under consideration, the evidence is 
equivocal. Absence of significant inter- 
actions involving sex and EEG patterns 
does not prove that these factors are 
unrelated to variability. However, it 
seems clear that they are not major 
sources of variation. One might specu- 
late that the positive findings in adults 
are associated with the physical changes 
that come with maturity, and with cul- 


tural factors which may bring about 
differences in the responsiveness of 
women and men. 


Summary 


A study was made of 24 normally 
hearing children to determine whether 
or not relations previously noted in 
adults among sex, EEG pattern, and 
electrodermal responsiveness to sound 
would be present. Auditory stimuli 
were 1000-cps tones presented mon- 
aurally at 40-db sensation level. The 
experimental session consisted of eight 
tones without reinforcement, five tones 
reinforced with annoying electric 
shocks, and eight tones without rein- 
forcement. Analysis of variance failed 
to show any significant interactions 
among sex, EEG pattern, and electro- 
dermal responses. 


Acknowledgment 


The authors gratefully acknowledge
the assistance of Dr. Robert C. Bilger,
formerly at Central Institute for the
Deaf and now in the Department of
Speech at the University of Michigan,
in the statistical analysis of the data.


References 


1. Charan, K. K., and Goldstein, R., Relation between EEG pattern
   and ease of eliciting electrodermal responses. J. Speech Hearing
   Dis., 22, 1957, 651-661.

2. Goldstein, R., Ludwig, H., and Naunton, R. F., Difficulty in
   conditioning galvanic skin responses: its possible significance
   in clinical audiometry. Acta Otolaryng., 44, 1954, 67-77.

3. Rosenblüt, B., Bilger, R. C., and Goldstein, R.,
   Electrophysiologic responses to sound as a function of intensity,
   EEG pattern and sex. J. Speech Hearing Res., 2, 1959, 28-39.

4. Walker, Helen M., and Lev, J., Statistical Inference. New York:
   Holt, 1953.




Sequence of Action 


of Breathing Muscles during Speech 


MICHAEL S. HOSHIKO 


With recent advances in experimental, 
instrumental, and statistical techniques, 
rapid progress is being made in the de- 
scription of the speech communication 
process. However, as Peterson (12) has 
pointed out, ‘It is about the systems 
which are most readily accessible to 
experimental observation that most in- 
formation concerning the speech pro- 
cess is available.’ The present study was 
planned to explore further the more 
inaccessible aspects of the speech mech- 
anism following the methodology and 
theoretical framework of Raymond H. 
Stetson (13, 14, 15, 16). Although he 
started reporting his studies on move- 
ments during speech as early as 1928, 
it was only recently that his formula- 
tions have found their way into a few 
textbooks and journal articles (1, 5, 7, 
8, 9, 10, 17, 18). As it appears that Stet- 
son’s description of the speech move- 
ments is gaining wider acceptance in the 
formulation of therapies for speech de- 





Michael S. Hoshiko, (Ph.D., Purdue Uni- 
versity, 1957) is Assistant Professor of Speech 
Correction, Southern Illinois University. This 
article is based on a Ph.D. dissertation com- 
pleted under the direction of Professor T. D. 
Hanley. A portion of the article is based on 
a paper presented at the 1958 Convention of 
the American Speech and Hearing Associa- 
tion, New York. The research was supported 
by a grant from the Purdue Research Foun- 
dation and the Special Committee on Scholar- 
ships of the American Speech and Hearing 
Association. 




fects, it seems appropriate to review his 
concepts and to extend his research at 
this time. 

The purpose of this study was to 
investigate electromyographically the 
sequence of the activity of the internal 
intercostals, the rectus abdominis, and 
the external intercostals as inferred 
from action-potential patterns during 
the preparation for utterance and the 
utterance of syllables at varying rates 
and during the utterance of a short 
sample of connected discourse at a 
normal rate. 


Procedure 


The experimental procedure of this 
investigation involved the simultaneous 
recording of the electromyographic 
phenomenon and the acoustic event. 


Subjects. The subjects in this study 
were 12 male undergraduates. Athletes, 
obese persons, those physically handi- 
capped, and those with scars from ab- 
dominal operations were rejected. All 
subjects reported normal hearing and 
all were free from speech defects. They 
ranged in age from 17 to 21 years with 
a mean age of 18 years. They possessed 
estimated body surface area ranging 
from 17 037.14 sq cm to 19 257.75 sq cm 
with an average of 18 441.13 sq cm.
These figures are within the normal 
limits as reported by West (19). 




Instrumentation. Sponge surface elec- 
trodes were used in this study. The 
specific procedures involved in the 
electromyographic aspect of the study, 
including electrode placement, were 
adopted from three sources: Campbell 
(2), Davis (4), and Stetson (14). Ac- 
tion potential was amplified by a Grass 
P 5 preamplifier. The syllables uttered 
by the subjects were received by an 
Electro-Voice model 915 microphone 
and amplified by a Thordarson model 
T 30W03 amplifier. Signals from both 
the preamplifier and the audio amplifier 
were introduced into a DuMont model 
322 dual beam oscilloscope so that the 
signals could be viewed simultaneously. 
A Hewlett-Packard model 202A low 
frequency function generator provided 
time-base signals which were super- 
imposed upon the audio signal. Per- 
manent records were photographed 
from the face of the oscilloscope by 
the use of a Fairchild oscillo record 
camera model F246A. 


Speech Material. Speech material for 
the study was selected on the basis of 
Stetson’s hypothesis that a discrete 
monosyllabic vowel would be released 
by the internal intercostal muscles and 
arrested by the external intercostal 
muscles, while a syllable with a conso- 
nant, vowel, and consonant would be 
released and arrested by the auxiliary 
consonant movement with some aid 
from the external intercostal muscles. 
For a five-syllable series with neither 
releasing nor arresting consonants, rep- 
etitions of the monosyllabic vowel [a] 
were used. For a five-syllable series 
with releasing consonants and arresting 
consonants, repetitions of the syllable 
‘pup’ were used. For the connected dis- 
course this sentence was used: ‘The 


boy called his dog pup, pup pup pup, 
pup.’ The speech material, printed on a 
card, was placed before the subject at 
the proper eye level. 


Electromyogram Recording Proce- 
dure. In an interview, each subject was 
instructed as to the nature of the ex- 
periment and then was given a period 
of practice. Because the oscilloscope 
had only two channels, it was necessary 
for each subject to repeat the entire 
set of speech materials several times 
during the experiment in order that the 
action-potential from each of the three 
muscles could be recorded. One chan- 
nel was used for the action-potential 
and the other for the time line and the 
superimposed acoustic signal. The sub- 
ject was required to utter the vowel 
[a] at a rate of approximately one per 
second and recordings were made of 
the action potentials from the internal 
intercostals. The procedure was re- 
peated for the rectus abdominis and 
again for the external intercostals. The
subject was then required to utter the
vowel [a] as rapidly as he could distinctly,
repeating as before so that
electromyograms could be obtained 
from the three muscles. The entire 
procedure (slow and fast) was repeated 
for the syllable ‘pup.’ Finally the sub- 
ject was required to utter the sentence, 
‘The boy called his dog pup, pup pup 
pup, pup,’ at his normal speaking rate 
in three breath groups, trying to say all 
syllables at the same rate, so that elec- 
tromyograms could be obtained from 
the three muscles. There were thus
three conditions: (a) slow utterance of
vowel and word; (b) fast utterance of
vowel and word; (c) normal utterance
of sentence. For each condition electromyograms
were made of all three muscles.


Hoshiko: Breathing Muscle Action Sequence in Speech 293


Table 1. Means and standard deviations (in milliseconds) of the durations of phonation of the
monosyllabic vowel [a] and the syllable ‘pup’ at slow and rapid rates.

                                    Slow Rate         Rapid Rate
Muscles                Utterance    Mean     SD       Mean     SD
Internal Intercostals  [a]          360.3    70.9     234.4    80.7
                       pup          187.0    90.0     136.7    33.6
Rectus Abdominis       [a]          337.1    65.9     219.1    37.5
                       pup          187.6    92.6     133.8    29.8
External Intercostals  [a]          326.9    73.9     251.1    83.6
                       pup          192.5    29.7     130.8    40.4


Reliability. Subjects’ reliability for
repeated performances was assessed by
comparing the syllable duration for
each condition. In addition, repeated 
recordings of prephonation action po- 
tential durations from the same muscles 
were obtained from three subjects. 
Comparisons among the repeated per- 
formances from the same muscles from 
three subjects indicated correlation co- 
efficients of .825 and .923 from the 
internal intercostals and .920 and minus 
.193 from the external intercostals. The 
difficulty involved in obtaining records 
from the external intercostal muscles 
may account for the negative correla- 
tion. Comparisons of the durations of 
phonation among the three muscles 
indicated small variations for both slow 
and rapid utterances. Table 1 summa- 
rizes this information. 
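
The reliability figures above are correlation coefficients between repeated recordings. The text does not say which coefficient was computed; the following is a minimal sketch assuming the ordinary Pearson product-moment coefficient. The function name and the session values are hypothetical and are not data from the study.

```python
import math

def pearson_r(first, second):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares about the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical prephonation durations (ms) from two recordings of one muscle.
first_session = [110.0, 95.0, 130.0, 120.0, 105.0]
second_session = [115.0, 90.0, 128.0, 125.0, 100.0]
print(round(pearson_r(first_session, second_session), 3))
```

A coefficient near 1 would indicate that repeated performances rank the durations in nearly the same way, while a value near zero or below, such as the minus .193 reported for the external intercostals, would indicate that the repeated records do not agree.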

Experimenter’s reliability of measure- 
ment of the records was determined by 
repeated measurements of 75 elec- 
tromyograms from five subjects. Ex- 
perimental reliability for repeated 
measurements was .950, .965, and .755 
from the internal intercostals, external
intercostals, and rectus abdominis, respectively.


Results

In the first condition (slow utter- 
ance of a series of five repetitions 
of discrete chest-release, chest-arrest 
monosyllabic [a] vowels and five repe- 
titions of the consonant-release, con- 
sonant-arrest type of syllable, ‘pup’) 
the subjects’ rate of utterance was 
found to be slightly faster than one 
syllable per second for the vowel [a] 
and 1.7 syllables per second for the 
syllable ‘pup.’ The sequence of activity 
of the muscles used was found to be as 
follows: the internal intercostals led 
slightly, the rectus abdominis followed, 
the external intercostals acted last. Stet- 
son (14, 15) states that the rectus ab- 
dominis, because it is the large abdom- 
inal muscle, leads off with the breath 
movement and, therefore, is the first to 
act, while the external intercostals act 
last to arrest the syllable. In the pres- 
ent investigation, the rectus abdominis 
was found to lag slightly behind the 
internal intercostals, and the two intercostal
muscles seemed to contract together,
with the external intercostals
lagging slightly.





294 Journal of Speech and Hearing Research 
















Figure 1. Action-potential records obtained from the internal intercostal muscles during the
slow utterance of discrete vowel [a]. The upper line indicates action-potential pattern and
the lower line indicates the time line. The period of each cycle of the time line is 0.002
sec. The speech signal is superimposed upon the time line and the duration of phonation is
indicated by the disturbance on the time line.


Measurement was made of the pre- 
phonation duration of the action po- 
tential, that is, the action potential seen 
on the record prior to the onset of 
phonation. The length of this period 
for the consonant-release, consonant- 
arrest syllables was approximately the 
same as that for the vowel syllables. 

Measurement was made also of the 
postonset duration of the action poten- 
tial, that is, the action potential seen 
on the record immediately after the 
onset of phonation. Examination of 
this part of the record indicated that
the first muscle to cease activity was 
the rectus abdominis, next was the ex- 
ternal intercostals, and last was the in- 
ternal intercostals. This sequence was 
seen for both types of syllable. Thus, 
it appears that the internal intercostals 
are the first muscles to pulse the syl- 
lable and are also the last muscles to 
stop contracting. 

A sample electromyogram from the 
internal intercostals during the phona- 
tion of the monosyllabic vowel [a] at 
a rate of slightly more than one per 
second (Figure 1) indicates relatively 
little activity just prior to the speech 
attempt, then a burst of action-potential 
activity with onset of phonation indi- 
cated by the disturbance in the time 
line. The total duration of the action-potential
activity is seen, starting before
the onset of phonation and continuing
after the onset of phonation.
However, the action potential termi- 
nates before the end of phonation. At 
this slowed rate of utterance the mus- 
cles appear to come almost to the rest- 
ing state between syllables. 
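
The observation that muscle activity ends before phonation does can be illustrated with the group means. The following is a rough sketch only: it subtracts the mean postonset duration of the action potential (Table 2) from the mean duration of phonation (Table 1) for the slow vowel [a]. Decimal points in the Table 2 values are assumed where the print is unclear, the names `SLOW_VOWEL` and `silent_tail_ms` are hypothetical, and a difference of two group means is not a measurement taken from any single record.

```python
# Mean durations (ms) for slow utterance of the vowel [a].
# Phonation means are read from Table 1, postonset means from Table 2;
# decimal points in Table 2 are assumed where the print is unclear.
SLOW_VOWEL = {
    "internal intercostals": {"phonation": 360.3, "postonset": 108.7},
    "rectus abdominis": {"phonation": 337.1, "postonset": 49.8},
    "external intercostals": {"phonation": 326.9, "postonset": 82.2},
}

def silent_tail_ms(phonation_ms, postonset_ms):
    """Portion of phonation (ms) that continues after muscle activity stops."""
    return phonation_ms - postonset_ms

for muscle, d in SLOW_VOWEL.items():
    tail = silent_tail_ms(d["phonation"], d["postonset"])
    print(f"{muscle}: {tail:.1f} ms of phonation after activity ends")
```

On these means the difference is positive for all three muscles, which agrees with the sample record described above.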

In the second condition of this study, 
which is the same as the first except 
that the speed of utterance was in- 
creased, it was found that the subjects 
uttered approximately three vowels per
second and from 3.6 to 4 ‘pup’ syllables per
second. Stetson (13) observed that at
higher rates of utterance the large abdominal
muscles do not pulse for each
syllable, though chest pulses still may
be detected. In the present study, however,
for the utterance of these rapid
discrete syllables the rectus abdominis 
still produced individual pulses for each 
syllable. The sequential relationship 
among the muscles was as follows: first 
the internal intercostals, then the rectus 
abdominis, and finally the external in- 
tercostals showed activity. This se- 
quence of muscle activity is the same 
as that found for slow speaking. 
During the more rapid utterances the 
rectus abdominis was the first to termi- 
nate its activity. In the phonation of 
[a] the order of termination of activity 
was as follows: external intercostals, 
internal intercostals, and then rectus 
abdominis. The order of cessation of the 
two intercostals was reversed for the 




























Table 2. Means and standard deviations (in milliseconds) of durations of action potentials from the
internal intercostals, external intercostals, and rectus abdominis prior to onset of phonation and after
onset of phonation for the utterance of vowel [a] and syllable ‘pup’ at slow and rapid rates.

                                    Prephonation                     Postonset of Phonation
                                    Slow Rate       Rapid Rate       Slow Rate       Rapid Rate
Muscle                 Utterance    Mean    SD      Mean    SD       Mean    SD      Mean    SD
Internal Intercostals  [a]          117.6   64.4    85.9    22.0     108.7   55.9    52.2    25.4
                       pup          142.2   37.2    109.2   6.8      46.2    26.5    29.6    9.7
Rectus Abdominis       [a]          108.5   50.0    67.4    19.6     49.8    49.2    29.4    5.7
                       pup          117.7   54.1    95.9    28.4     39.3    39.0    [illegible]
External Intercostals  [a]          89.7    36.6    53.2    30.1     82.2    18.7    46.7    25.4
                       pup          [illegible]








‘pup’ syllables. Comparison between 
duration of action potentials prior to 
onset of phonation and duration of 
action potential after the onset of 
phonation indicated that the standard 
deviations were larger for the postonset 
of phonation duration, pointing to 
greater variability for the termination 
of the action potential. However, some 





Figure 2. Comparison of the duration of the
action potential in milliseconds prior to phonation
and after onset of phonation during the
utterance of discrete vowel [a] and the syllable
‘pup’ at slow and rapid rates. Horizontal axis, time
in milliseconds. Vertical line at 0,
point of onset of phonation; left lines, duration
of action potential prior to onset of phonation;
right lines, duration of action potential after the
onset of phonation. A, internal intercostal muscles;
B, rectus abdominis muscle; C, external
intercostal muscles. Solid lines, slow utterance;
broken lines, rapid utterance. Upper group, [a]
vowels; lower group, ‘pup’ syllables.


records indicated a discrete burst of 
action potential which lasted only until 
the onset of phonation. In all cases 
except one for both rates of utterance 
the duration of the prephonation action 
potentials was longer for the syllable 
‘pup’ than for the vowel [a]. Although 
the relative magnitudes of the standard 
deviations (Table 2) were fairly large, 
the trend seemed to indicate that rapid 
ballistic utterance decreased the vari- 
ability. 

Figure 2 summarizes the sequence of 
action among the three muscle groups 
for the two types of syllables and rates. 
The order of onset for both the vowel 
[a] and the syllable ‘pup’ for both 
speeds is as follows: internal intercos- 
tals, rectus abdominis, and external in- 
tercostals. The order of termination for 
the vowel [a] is as follows: rectus 
abdominis, external intercostals, and 
internal intercostals. The order of 
termination for the syllable ‘pup’ is 
rectus abdominis, internal intercostals, 
external intercostals. For the external 
and internal intercostals there is a re- 
versal of the order found for vowel 
[a]. At both the slow and the fast rate,
the prephonation duration was greater
for the syllable ‘pup’ than for vowel 
[a] and the postonset of phonation 
duration was shorter for the syllable 
‘pup’ than for the vowel [a]. For the 
vowel [a] there was a longer duration 
before termination of the action po- 
tential than for the syllable ‘pup.’ 
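
The onset and termination orders stated above can be checked against the group means. The sketch below assumes that a longer mean prephonation duration corresponds to an earlier onset and that a shorter mean postonset duration corresponds to an earlier cessation. It uses the slow-rate [a] means as read from Table 2, with decimal points assumed where the print is unclear; the names `SLOW_VOWEL_MEANS`, `onset_order`, and `termination_order` are hypothetical.

```python
# Mean durations (ms) for slow utterance of the vowel [a], as read from Table 2.
# Decimal points are assumed where the printed table is unclear.
SLOW_VOWEL_MEANS = {
    "internal intercostals": {"prephonation": 117.6, "postonset": 108.7},
    "rectus abdominis": {"prephonation": 108.5, "postonset": 49.8},
    "external intercostals": {"prephonation": 89.7, "postonset": 82.2},
}

def onset_order(means):
    """Longest prephonation activity first, i.e. earliest onset first."""
    return sorted(means, key=lambda m: means[m]["prephonation"], reverse=True)

def termination_order(means):
    """Shortest postonset activity first, i.e. earliest cessation first."""
    return sorted(means, key=lambda m: means[m]["postonset"])

print(onset_order(SLOW_VOWEL_MEANS))
# ['internal intercostals', 'rectus abdominis', 'external intercostals']
print(termination_order(SLOW_VOWEL_MEANS))
# ['rectus abdominis', 'external intercostals', 'internal intercostals']
```

Both orders agree with those given in the text for the vowel [a]. The same ranking could be applied to the other legible rows of Table 2, bearing in mind that the standard deviations are large relative to the differences between the means.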

In the third or sentence condition, a 
comparison of electromyograms revealed
a difference between the connected
syllables at the beginning of the
sentence (The boy called his dog) and
the discrete syllables at the end of the
sentence (pup, pup pup pup, pup).
Discrete action potential activity did
not always initiate each of the connected
syllables but pronounced activity
initiated each repetition of the
discrete syllable ‘pup.’ As mentioned
earlier, Stetson suggested that at the
higher rates of utterance the large abdominal
muscles do not pulse for each
syllable. The present study found this
true for the most part for the connected
syllables but not for the discrete. Moreover,
the present results show that the
rectus abdominis appears to be capable 
of pulsing at higher rates than four 
syllables per second, the rate considered 
by Stetson to be the limit. 


Discussion 


One difference between the findings 
of this study and those of Stetson’s 
research is that both internal and ex- 
ternal intercostals were found to be 
active in releasing the syllables, or 
initiating the simple pulses—ballistic 
movements—associated with syllabica- 
tion. None of the records indicated that 
the external intercostals were involved 
in terminating the syllable movement. 
The method of recording action potential
was not identical with that used by
Stetson as simultaneous recordings from 
several muscles could not be made. 
However, the method used should not 
have affected the onset of muscle action 
in such a way that if the syllable move- 
ment were terminated by the internal 
intercostals the action potential would 
appear prior to the onset of phonation. 
Another explanation for the lack of 
activity from the external intercostals 
at the end of phonation may be that 
speech activity is also a controlled 
movement. In this case the two inter- 
costals would be expected to be active 
at the same time. A recent study by 
Draper, Ladefoged, and Whitteridge 
(6) also suggests that greater control 
of pressure and hence of the utterance 
may be accomplished with simultaneous 
use of both inspiratory and expiratory 
muscles. 

Present data tend to support the sug- 
gestion that the internal and external 
intercostals cooperate in the releasing 
of the syllable. Stetson’s (13, 14) con- 
cept that the external intercostals are 
necessary to arrest the ballistic move- 
ment in the absence of an arresting 
consonant is not borne out by these 
findings. 


Summary 

Electromyographic investigation of 
the sequence of action during phona- 
tion from three respiratory muscles 
indicated the onset pattern as follows: 
internal intercostals, rectus abdominis, 
and external intercostals. The pattern 
of action potential activity indicated 
that muscles have an onset sequence 
pattern which is maintained in spite of 
changes in speech material and rates of 
utterance. Lack of action potential activity
from the external intercostal
muscles at the termination of the phonation
suggested that this arresting action
may depend upon other muscle
activity.


References 


 1. Berry, Mildred, and Eisenson, J., Speech
    Disorders; Principles and Practices of
    Therapy. New York: Appleton-Century-Crofts, 1956.

 2. Campbell, E. J. M., Muscular control of
    breathing man. Ph.D. dissertation, Univ.
    London, 1954.

 3. Curtis, J. F., Systematic research in experimental
    phonetics: 3. The case for
    dynamic analysis in acoustic phonetics.
    J. Speech Hearing Dis., 19, 1954, 147-157.

 4. Davis, J. F., Manual of Surface Electromyography.
    Montreal: Allan Mem. Inst.
    Psychiat., McGill Univ., 1952.

 5. DiCarlo, L. M., and Amster, W. W.,
    Hearing and speech behavior among children
    with cerebral palsy. In W. M.
    Cruickshank and G. M. Raus (Eds.),
    Cerebral Palsy, its Individual and Community
    Problems. Syracuse: Syracuse
    Univ. Press, 1955.

 6. Draper, M. H., Ladefoged, P., and Whitteridge,
    D., Respiratory muscles in speech.
    J. Speech Hearing Res., 2, 1959, 16-27.

 7. Fothergill, Patricia, and Harrington, R.,
    The clinical significance of the stretch
    reflex in speech reeducation for the spastic.
    J. Speech Hearing Dis., 14, 1949, 353-355.

 8. Hudgins, C. V., Visual aids in the correction
    of speech. Volta Rev., 37, 1935, 637-643.

 9. Johnson, W., Brown, S. F., Curtis, J. F.,
    Edney, C. W., and Keaster, Jacqueline,
    Speech Handicapped School Children.
    (rev. ed.) New York: Harper, 1956.

10. Kaiser, Louise (Ed.), Manual of Phonetics.
    Amsterdam: North Holland Pub., 1957.

11. McDonald, E. T., and Koepp-Baker, H.,
    Cleft palate speech: an integration of research
    and clinical observation. J. Speech
    Hearing Dis., 16, 1951, 9-20.

12. Peterson, G. E., Systematic research in
    experimental phonetics: 4. The evaluation
    of speech signals. J. Speech Hearing Dis.,
    19, 1954, 159-168.

13. Stetson, R. H., Motor phonetics. Arch.
    néerl. Phon. exp., 3, 1928, 1-216.

14. Stetson, R. H., Motor Phonetics; a Study
    of Speech Movements in Action. (2nd
    ed.) Amsterdam: North Holland Pub.,
    1951 (for Oberlin Coll.).

15. Stetson, R. H., Speech movements in action.
    Trans. Amer. laryng. Ass., 55, 1933, 29-41.

16. Stetson, R. H., and Hudgins, C. V.,
    Functions of the breathing movements in
    the mechanism of speech. Arch. néerl.
    Phon. exp., 5, 1930, 1-30.

17. Strother, C. R., Voice training after laryngectomy.
    In Emil Froschels (Ed.),
    Twentieth Century Speech and Voice
    Correction. New York: Philosophical
    Libr., 1948.

18. Van Riper, C., and Irwin, J. V., Voice
    and Articulation. Englewood Cliffs, N. J.:
    Prentice-Hall, 1958.

19. West, H. F., Clinical studies on the respiration,
    a comparison of various standards
    for the normal vital capacity of the
    lungs. Arch. intern. Med., 25, 1920, 306-316.











Book Reviews 


Békésy, Georg von, Experiments in Hearing.
New York: McGraw-Hill, 1960. Pp. 745. $25.


In the field of hearing, this book, beyond a 
doubt, is the most important, the most com- 
prehensive, the most inspiring since the time 
of Helmholtz. Like parts of Helmholtz’s Sen- 
sations of Tone this book of Békésy’s is based 
upon experiment, but these are marvelous ex- 
periments. It is not possible to describe the 
thrill of the experience of just looking 
through this book, or to transmit the awe 
arising from seeing, as one reads, Békésy’s 
extreme breadth of information, his perfect 
insight into the problems, and the exact exe- 
cution of the most delicate experiments. This 
book is another landmark; in fact, in this re- 
viewer’s opinion, it is, in the field of hearing, 
the most significant accumulation of informa- 
tion since the game of ‘how does it work’ got 
started. It is not feasible to list the contribu- 
tions to knowledge of the ear and hearing 
that are described in this book; practically 
every page contains something of importance. 
Suffice it to say that here in one place and 
in clear concise English is the accumulation 
of all the scientific reports that Békésy has 
contributed to our field in the past 34 years. 
Professor E. G. Wever, a recognized contributor
in his own right, aided in the translation
and editing of the book so that it is
presented in uniform, readable style. 

Part of the readability is provided by the 
organization of the original scientific articles 
by subject rather than date of publication. 
This provides at once a great advantage for 
the reader who may recall that Békésy wrote 
something about a particular line of investi- 
gation but who may not remember where 
in the literature it is to be found. This is not 
always a guarantee, however, that the reader 
will find what he is looking for by following 
the subject headings; Békésy often considers 
many things in one article and by keeping 
these reports as units, sometimes important 
experiments appear somewhere else in the 
book. Fortunately there is a very good index
which provides, in most instances, an adequate
lead.

Book Reviews is edited by Ernest H. Henrikson,
Professor of Speech and Director of
the Speech and Hearing Clinic, University of
Minnesota.

There are four basic parts under which the 
articles are classified. Part I, ‘Introduction,’ 
contains discussions of the problems in audi- 
tory research, of the anatomy of the ear and 
of Békésy’s unique and original experimental 
apparatus and methods for employing them. 
Part II, ‘Conductive Processes,’ contains chap- 
ters concerned with the action of the middle 
ear and with bone conduction. Part III, ‘The
Psychology of Hearing,’ consists of the many
publications on auditory thresholds, the spatial
attributes of sounds, the problems of
distortion and room acoustics. Part IV, ‘Cochlear
Mechanics,’ is Békésy’s forte; here in one
place are described the classic experiments 
on the pattern of vibration within the coch- 
lea, wave motion within the cochlea, fre- 
quency analysis, and the electrophysiology of 
the cochlea. Then there is appended the au- 
thor’s bibliography in this field. The number 
of pages in each part is an indication of where 
Békésy’s interest lies although this does not 
minimize the depth of the research in any 
one area. There are 30 pages devoted to 
anatomy and techniques, but many other 
methods are presented later in the book; 
there are 86 pages on the conductive proc- 
esses; 185 pages on experiments related to 
the psychology of hearing; and 300 pages cov- 
ering cochlear mechanics. 

Békésy’s publications have always been 
noted for their clear line drawings and 
graphs. Here they are again produced in the 
same explicit style, and with English labels. 
Unfortunately a few of the half-tone pictures 
do not always turn out so well, especially 
those illustrating some of the early deca 
used in the investigation of human temporal 
bones, but the original of these, no doubt, is 
no longer available, and we should consider 
ourselves fortunate that they appear at all. 

Dr. Békésy has been an indefatigable and 
undaunted investigator. Practically every 
word he writes is a contribution and many 
experiments were performed under conditions 
where lesser men would have been greatly 
discouraged. These early experiments, from 
1924 to 1946, were carried out in the Royal 
Hungarian Institute for Research in Teleg- 
raphy in Budapest and there was a time when 
Békésy and his staff removed stones from the 


September 1960 








en 
nd 
he 
‘Is. 
res 
lly 
1€s 
ral 


jer 


cry 
ny 
ons 


atly 


yal 
leg- 
hen 
the 


960 








walls of the laboratory so that valuable instru- 
ments could be sealed behind to hide them 
from the unscrupulous invaders. Rarely does 
Dr. Békésy talk of these days but this review- 
er has been present on a few occasions when 
he has. Those days were not easy. Experi- 
ments in 1947 were performed in the Depart- 
ment of Telegraphy and Telephony of the 
Royal Institute of Technology in Stockholm, 
and since that time in the Psychoacoustic 
Laboratory of Harvard University. This book 
is not the final mark; Békésy has years of 
active experimentation ahead of him and 
there are many things about the ear that are 
still unknown. 


Experiments in Hearing, though expensive, 
should be near the hand of everyone dealing 
with the ear and hearing: otologists, anatomists,
physiologists, physicists, psychologists,
and audiologists. It is more than something
to be handy in the library. Everyone inter- 
ested should have his own copy; it will take 
a lot of poring over and is constantly useful. 


Merle Lawrence
University of Michigan 


Author’s comment: Thank you very much 
for sending me a copy of the review by Pro- 
fessor Merle Lawrence of my book Experi- 
ments in Hearing. As the review is very 
favorable and to the point, I think I could 
only harm Professor Lawrence’s statements 
if I were to comment on his review. If you 
don’t mind, I should therefore love to let the 
review stand as it is. 

G. v. Békésy
Harvard University 








